465 results for visual representation
Abstract:
The performance of visual speech recognition (VSR) systems is significantly influenced by the accuracy of the visual front-end. Current state-of-the-art VSR systems use off-the-shelf face detectors such as Viola-Jones (VJ), which have limited reliability under changes in illumination and head pose. For a VSR system to perform well under these conditions, an accurate visual front-end is required. This is an important problem to solve in many practical implementations of audio-visual speech recognition systems, for example in automotive environments for an efficient human-vehicle computer interface. In this paper, we re-examine the current state-of-the-art VSR by comparing off-the-shelf face detectors with the recently developed Fourier Lucas-Kanade (FLK) image alignment technique. A variety of image alignment and visual speech recognition experiments are performed on a clean dataset as well as on a challenging automotive audio-visual speech dataset. Our results indicate that the FLK image alignment technique can significantly outperform off-the-shelf face detectors, but requires frequent fine-tuning.
Abstract:
Background Standard operating procedures state that police officers should not drive while interacting with their mobile data terminal (MDT) which provides in-vehicle information essential to police work. Such interactions do however occur in practice and represent a potential source of driver distraction. The MDT comprises visual output with manual input via touch screen and keyboard. This study investigated the potential for alternative input and output methods to mitigate driver distraction with specific focus on eye movements. Method Nineteen experienced drivers of police vehicles (one female) from the NSW Police Force completed four simulated urban drives. Three drives included a concurrent secondary task: imitation licence plate search using an emulated MDT. Three different interface methods were examined: Visual-Manual, Visual-Voice, and Audio-Voice (“Visual” and “Audio” = output modality; “Manual” and “Voice” = input modality). During each drive, eye movements were recorded using FaceLAB™ (Seeing Machines Ltd, Canberra, ACT). Gaze direction and glances on the MDT were assessed. Results The Visual-Voice and Visual-Manual interfaces resulted in a significantly greater number of glances towards the MDT than Audio-Voice or Baseline. The Visual-Manual and Visual-Voice interfaces resulted in significantly more glances to the display than Audio-Voice or Baseline. For longer duration glances (>2s and 1-2s) the Visual-Manual interface resulted in significantly more fixations than Baseline or Audio-Voice. The short duration glances (<1s) were significantly greater for both Visual-Voice and Visual-Manual compared with Baseline and Audio-Voice. There were no significant differences between Baseline and Audio-Voice. Conclusion An Audio-Voice interface has the greatest potential to decrease visual distraction to police drivers. However, it is acknowledged that an audio output may have limitations for information presentation compared with visual output. 
The Visual-Voice interface offers an environment where the capacity to present information is sustained, whilst distraction to the driver is reduced (compared to Visual-Manual) by enabling adaptation of fixation behaviour.
Abstract:
The appropriateness of applying drink driving legislation to motorcycle riding has been questioned as there may be fundamental differences in the effects of alcohol on driving and motorcycling. It has been suggested that alcohol may redirect riders’ focus from higher-order cognitive skills such as cornering, judgement and hazard perception, to more physical skills such as maintaining balance. To test this hypothesis, the effects of low doses of alcohol on balance ability were investigated in a laboratory setting. The static balance of twenty experienced and twenty novice riders was measured while they performed either no secondary task, a visual (search) task, or a cognitive (arithmetic) task following the administration of alcohol (0%, 0.02%, and 0.05% BAC). Subjective ratings of intoxication and balance impairment increased in a dose-dependent manner in both novice and experienced motorcycle riders, while a BAC of 0.05%, but not 0.02%, was associated with impairments in static balance ability. This balance impairment was exacerbated when riders performed a cognitive, but not a visual, secondary task. Likewise, 0.05% BAC was associated with impairments in novice and experienced riders’ performance of a cognitive, but not a visual, secondary task, suggesting that interactive processes underlie balance and cognitive task performance. There were no observed differences between novice vs. experienced riders on static balance and secondary task performance, either alone or in combination. Implications for road safety and future ‘drink riding’ policy considerations are discussed.
Abstract:
This study investigated how and to what degree “hybrid photography” (the simultaneous use of indexical and fictional properties and strategies) innovates the representation of animals within animalcentric, ecocentric frameworks. Design theory structured this project’s practice-led, visual research methodology framework. Grounded theory processes articulated emerging categories of hybrid photography by systematically and comparatively treating animal photography works for reflexive analysis. Design theory then applied and clarified these categories, developing practice that re-visualised shark perspectives as new ecological discourse. Shadows, a creative practice installation, realised a full-scale photographic investigation into shark and marine animal realities of a specific environment (Heron Island and Gladstone, Great Barrier Reef) facing ecological crisis from dredging and development at Gladstone Harbour. The works rendered and explored hybrid photography’s capacity for illuminating nonhuman animals, in particular sharks, and comprise 65% of this project’s weighting. This exegetical paper offers a definition, strategies and evaluation of hybrid photography in unsettling animal perspectives as effective ecological discourse, and comprises 35%.
Abstract:
This study considers the challenges in representing women from other cultures in the crime fiction genre. The study is presented in two parts; an exegesis and a creative practice component consisting of a full length crime fiction novel, Batafurai. The exegesis examines the historical period of a section of the novel—post-war Japan—and how the area of research known as Occupation Studies provides an insight into the conditions of women during this period. The exegesis also examines selected postcolonial theory and its exposition of representations of the 'other' as a western construct designed to serve Eurocentric ends. The genre of crime fiction is reviewed, also, to determine how characters purportedly representing Oriental cultures are constricted by established stereotypes. Two case studies are examined to investigate whether these stereotypes are still apparent in contemporary Australian crime fiction. Finally, I discuss my own novel, Batafurai, to review how I represented people of Asian background, and whether my attempts to resist stereotype were successful. My conclusion illustrates how novels written in the crime fiction genre are reliant on strategies that are action-focused, rather than character-based, and thus often use easily recognizable types to quickly establish frameworks for their stories. As a sub-set of popular fiction, crime fiction has a tendency to replicate rather than challenge established stereotypes. Where it does challenge stereotypes, it reflects a territory that popular culture has already visited, such as the 'female', 'black' or 'gay' detective. Crime fiction also has, as one of its central concerns, an interest in examining and reinforcing the notion of societal order. It repeatedly demonstrates that crime either does not pay or should not pay. One of the ways it does this is to contrast what is 'good', known and understood with what is 'bad', unknown, foreign or beyond our normal comprehension. 
In western culture, the east has traditionally been employed as the site of difference, and has been constantly used as a setting of contrast, excitement or fear. Crime fiction conforms to this pattern, using the east to add richness and depth to what otherwise might become a 'dry' tale. However, when used in such a way, what is variously eastern, 'other' or Oriental can never be paramount, always falling to the secondary side of the binary opposites (good/evil, known/unknown, redeemed/doomed) at work. In an age of globalisation, the challenge for contemporary writers of popular fiction is to be responsive to an audience that demands respect for all cultures. Writers must demonstrate that they are sensitive to such concerns and can skillfully manage the tension between the need to deliver work that operates within the parameters of the genre and the desire to avoid offence to any cultural or ethnic group. In my work, my strategy for managing this tension has been to create a back-story for my characters of Asian background, developing them above mere genre types, and to situate them with credibility in time and place through appropriate historical research.
Abstract:
Maternally inherited diabetes and deafness (MIDD) is an autosomal dominant inherited syndrome caused by the mitochondrial DNA (mtDNA) nucleotide mutation A3243G. It affects various organs including the eye with external ophthalmoparesis, ptosis, and bilateral macular pattern dystrophy.1, 2 The prevalence of retinal involvement in MIDD is high, with 50% to 85% of patients exhibiting some macular changes.1 Those changes, however, can vary between patients and within families dramatically based on the percentage of retinal mtDNA mutations, making it difficult to give predictions on an individual’s visual prognosis...
Abstract:
In recent years there has been a noticeable move by various public institutions, such as public service broadcasters and community media organisations, to capture and disseminate the voices and viewpoints of ‘ordinary people’ through inviting them to share stories about their lives. One of the foremost objectives of many such projects is to provide under-represented individuals and groups with an opportunity to express and represent themselves; as such, the capture and broadcast of ‘authentic voices’ is a central value. This paper discusses the notion of ‘authentic voice’, and questions the framing role of public media organisations in storytelling projects that aim to provide individuals with space for self-expression and self-representation. It considers the ways in which tensions arise on multiple levels when individuals are asked to express and represent themselves within projects and spaces that are managed by institutions. This paper begins by discussing the challenges and opportunities that arise within storytelling projects that are facilitated by public institutions and community media arts organisations, and that aim to amplify the voices of “ordinary people” (Thumim, 2009). It examines ways in which ‘voice’ is facilitated, curated, broadcast and distributed within such projects, particularly questioning the ways in which project facilitation and the curation of stories for public broadcast can both help and hinder the amplification of ‘authentic voice’. Furthermore, we seek to discuss how ‘authentic voice’ is defined, and what is involved in the process of amplification. The paper moves on to discuss a case study in order to demonstrate some of the tensions that are evident within a storytelling project that is managed by a public institution – Australia’s national broadcaster – and the ways these tensions impact upon the capture and broadcast of an ‘authentic voice’ for project participants. 
The Australian Broadcasting Corporation’s (ABC) ‘Heywire’ project is a storytelling competition and website that aims to ‘give voice’ to 16-22 year olds who live in rural, regional and remote parts of Australia. Looking at tensions that exist on organisational, political and philosophical levels within the Heywire project reveals a number of conflicts of interest and objectives between the institution and project participants. This leads us to question whether institutionally-managed storytelling projects can effectively support individuals to have an ‘authentic voice’, and whether struggles of aims and objectives diminish the personal benefits that people may derive from expressing and representing themselves within such projects.
Abstract:
Purpose To design and manufacture lenses to correct peripheral refraction along the horizontal meridian and to determine whether these resulted in noticeable improvements in visual performance. Method Subjective refraction of a low myope was determined on the basis of best peripheral detection acuity along the horizontal visual field out to ±30° for both horizontal and vertical gratings. Subjective refraction was compared to objective refractions using a COAS-HD aberrometer. Special lenses were made to correct peripheral refraction, based on designs optimized with and without smoothing across a 3 mm diameter square aperture. Grating detection was retested with these lenses. Contrast thresholds of 1.25’ spots were determined across the field for the conditions of best correction, on-axis correction, and the special lenses. Results The participant had high relative peripheral hyperopia, particularly in the temporal visual field (maximum 2.9 D). There were differences > 0.5D between subjective and objective refractions at a few field angles. On-axis correction reduced peripheral detection acuity and increased peripheral contrast threshold in the peripheral visual field, relative to the best correction, by up to 0.4 and 0.5 log units, respectively. The special lenses restored most of the peripheral vision, although not all at angles to ±10°, and with the lens optimized with aperture-smoothing possibly giving better vision than the lens optimized without aperture-smoothing at some angles. Conclusion It is possible to design and manufacture lenses to give near optimum peripheral visual performance to at least ±30° along one visual field meridian. The benefit of such lenses is likely to be manifest only if a subject has a considerable relative peripheral refraction, for example of the order of 2 D.
Abstract:
In this paper, we present SMART (Sequence Matching Across Route Traversals): a vision-based place recognition system that uses whole image matching techniques and odometry information to improve the precision-recall performance, latency and general applicability of the SeqSLAM algorithm. We evaluate the system’s performance on challenging day and night journeys over several kilometres at widely varying vehicle velocities from 0 to 60 km/h, compare performance to the current state-of-the-art SeqSLAM algorithm, and provide parameter studies that evaluate the effectiveness of each system component. Using 30-metre sequences, SMART achieves place recognition performance of 81% recall at 100% precision, outperforming SeqSLAM, and is robust to significant degradations in odometry.
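The "recall at 100% precision" figure quoted above can be illustrated with a minimal Python sketch. This is not the SMART or SeqSLAM evaluation code; the function name, the score convention (higher means a more confident match) and the toy data are assumptions made for illustration only.

```python
def recall_at_full_precision(scores, is_correct):
    """Highest recall reachable by a score threshold that admits no false match.

    scores     -- hypothetical match confidences, higher = more confident
    is_correct -- ground-truth flags, True if the proposed match is right
    """
    total_correct = sum(is_correct)
    if total_correct == 0:
        return 0.0
    # Rank by descending score; on ties, wrong matches come first so that
    # the estimate stays conservative.
    ranked = sorted(zip(scores, is_correct), key=lambda p: (-p[0], p[1]))
    accepted = 0
    for _, ok in ranked:
        if not ok:
            break  # any lower threshold would admit this false match
        accepted += 1
    return accepted / total_correct


# Toy example: the third-ranked match is wrong, so only two of the four
# correct matches can be accepted without losing 100% precision.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_truth = [True, True, False, True, True]
toy_recall = recall_at_full_precision(toy_scores, toy_truth)
```

In this toy case `toy_recall` is 0.5, since accepting the remaining correct matches would require a threshold that also admits the wrong one.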
Abstract:
The ability to automate forced landings in an emergency such as engine failure is essential to improving the safety of Unmanned Aerial Vehicles operating in General Aviation airspace. By using active vision to detect safe landing zones below the aircraft, the reliability and safety of such systems are vastly improved by gathering up-to-the-minute information about the ground environment. This paper presents the Site Detection System, a methodology utilising a downward-facing camera to analyse the ground environment in both 2D and 3D, detect safe landing sites and characterise them according to size, shape, slope and nearby obstacles. A methodology is presented showing the fusion of landing site detection from 2D imagery with a coarse Digital Elevation Map and dense 3D reconstructions using INS-aided Structure-from-Motion to improve accuracy. Results are presented from an experimental flight showing the precision/recall of landing sites in comparison to a hand-classified ground truth, and improved performance with the integration of 3D analysis from visual Structure-from-Motion.
Abstract:
OBJECTIVE To compare different reliability coefficients (exact agreement, and variations of the kappa (generalised, Cohen's, and Prevalence-Adjusted and Bias-Adjusted (PABAK))) for four physiotherapists conducting visual assessments of scapulae. DESIGN Inter-therapist reliability study. SETTING Research laboratory. PARTICIPANTS 30 individuals with no history of neck or shoulder pain and no obvious significant postural abnormalities were recruited. MAIN OUTCOME MEASURES Ratings of scapular posture were recorded in multiple biomechanical planes under four test conditions (at rest, and while under three isometric conditions) by four physiotherapists. RESULTS The magnitude of discrepancy between the two therapist pairs was 0.04 to 0.76 for Cohen's kappa, and 0.00 to 0.86 for PABAK. In comparison, the generalised kappa provided a score between the two paired kappa coefficients. The differences between mean generalised kappa coefficients and mean Cohen's kappa (0.02) and between mean generalised kappa and PABAK (0.02) were negligible, but the magnitude of difference between the generalised kappa and paired kappa within each plane and condition was substantial: 0.02 to 0.57 for Cohen's kappa and 0.02 to 0.63 for PABAK, respectively. CONCLUSIONS Calculating coefficients for therapist pairs alone may result in inconsistent findings. In contrast, the generalised kappa provided a coefficient close to the mean of the paired kappa coefficients. These findings support an assertion that generalised kappa may lead to a better representation of reliability between three or more raters and that reliability studies only calculating agreement between two raters should be interpreted with caution. However, generalised kappa may mask more extreme cases of agreement (or disagreement) that paired comparisons may reveal.
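For readers unfamiliar with the coefficients compared above, the following Python sketch shows how they are commonly computed. It uses the textbook definitions of Cohen's kappa and PABAK; the abstract does not say which formulation of the generalised kappa was used, so Fleiss' kappa is assumed here as one common multi-rater version. The function names and toy ratings are illustrative and are not data from the study.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1


def pabak(a, b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa: chance agreement fixed at 1/k."""
    observed = sum(x == y for x, y in zip(a, b)) / len(a)
    return (n_categories * observed - 1) / (n_categories - 1)


def fleiss_kappa(ratings):
    """Fleiss' kappa (assumed stand-in for the 'generalised kappa').

    ratings -- one list per item, holding the label given by each rater;
               every item must be rated by the same number of raters.
    """
    n_items, n_raters = len(ratings), len(ratings[0])
    totals = Counter()
    mean_item_agreement = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        pairs_agreeing = sum(c * c for c in counts.values()) - n_raters
        mean_item_agreement += pairs_agreeing / (n_raters * (n_raters - 1))
    mean_item_agreement /= n_items
    expected = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (mean_item_agreement - expected) / (1 - expected)


# Toy ratings (1 = abnormal posture, 0 = normal); illustrative only.
rater_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
rater_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
kappa_ab = cohen_kappa(rater_a, rater_b)
pabak_ab = pabak(rater_a, rater_b)
```

With these toy ratings the raters agree on 8 of 10 items, yet Cohen's kappa is 0.375 while PABAK is 0.6, which illustrates how skewed prevalence can pull the two coefficients apart.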
Abstract:
This paper presents a long-term experiment where a mobile robot uses adaptive spherical views to localize itself and navigate inside a non-stationary office environment. The office contains seven members of staff and experiences a continuous change in its appearance over time due to their daily activities. The experiment runs as an episodic navigation task in the office over a period of eight weeks. The spherical views are stored in the nodes of a pose graph and they are updated in response to the changes in the environment. The updating mechanism is inspired by the concepts of long- and short-term memories. The experimental evaluation is done using three performance metrics which evaluate the quality of both the adaptive spherical views and the navigation over time.
Abstract:
Purpose: Changes in pupil size and shape are relevant for peripheral imagery by affecting aberrations and how much light enters and/or exits the eye. The purpose of this study is to model the pattern of pupil shape across the complete horizontal visual field and to show how the pattern is influenced by refractive error. Methods: Right eyes of thirty participants were dilated with 1% cyclopentolate and images were captured using a modified COAS-HD aberrometer alignment camera along the horizontal visual field to ±90°. A two-lens relay system enabled fixation at targets mounted on the wall 3 m from the eye. Participants placed their heads on a rotatable chin rest and eye rotations were kept to less than 30°. Best-fit elliptical dimensions of pupils were determined. Ratios of minimum to maximum axis diameters were plotted against visual field angle. Results: Participants’ data were well fitted by cosine functions, with maxima at (–)1° to (–)9° in the temporal visual field and widths 9% to 15% greater than predicted by the cosine of the field angle φ. Mean functions were 0.99cos[(φ + 5.3)/1.121], R² = 0.99 for the whole group and 0.99cos[(φ + 6.2)/1.126], R² = 0.99 for the 13 emmetropes. The function peak became less temporal, and the width became smaller, with increase in myopia. Conclusion: Off-axis pupil shape changes are well described by a cosine function which is both decentered by a few degrees and flatter by about 12% than the cosine of the viewing angle, with minor influences of refraction.
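The whole-group mean function reported above can be evaluated with a short Python sketch. It assumes the field angle is expressed in degrees and that negative angles denote the temporal visual field, as the quoted maxima suggest; the function and parameter names are illustrative and do not come from the paper.

```python
import math


def pupil_axis_ratio(field_angle_deg, amplitude=0.99, offset_deg=5.3, width=1.121):
    """Ratio of minimum to maximum pupil axis diameter at a given field angle.

    Defaults are the whole-group mean fit quoted in the abstract; the values
    for the 13 emmetropes would be offset_deg=6.2 and width=1.126.
    """
    scaled_deg = (field_angle_deg + offset_deg) / width
    return amplitude * math.cos(math.radians(scaled_deg))


# The fitted curve peaks where the cosine argument is zero, at -5.3 degrees.
peak_ratio = pupil_axis_ratio(-5.3)

# At 60 degrees the fit stays above the plain cosine of the field angle,
# which is the "flatter than cosine" behaviour the abstract describes.
fit_at_60 = pupil_axis_ratio(60.0)
plain_cosine_at_60 = math.cos(math.radians(60.0))
```

Dividing the angle by a width factor greater than 1 stretches the cosine along the angle axis, which is why the fitted ratio falls off more slowly than the plain cosine at large field angles.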
Abstract:
Efficient and effective feature detection and representation is an important consideration when processing videos, and a large number of applications such as motion analysis, 3D scene understanding and tracking depend on it. Amongst several feature description methods, local features are becoming increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational complexity, their performance is still too limited for real-world applications. Furthermore, the rapid uptake of mobile devices has increased the demand for algorithms that can run with reduced memory and computational requirements. In this paper we propose a semi-binary feature detector-descriptor based on the BRISK detector, which can detect and represent videos with significantly reduced computational requirements, while achieving performance comparable to state-of-the-art spatio-temporal feature descriptors. First, the BRISK feature detector is applied on a frame-by-frame basis to detect interest points, then the detected key points are compared against consecutive frames for significant motion. Key points with significant motion are encoded with the BRISK descriptor in the spatial domain and the Motion Boundary Histogram in the temporal domain. This descriptor is not only lightweight but also has lower memory requirements because of the binary nature of the BRISK descriptor, allowing the possibility of applications on hand-held devices. We evaluate the performance of the detector-descriptor combination in the context of action classification with a standard, popular bag-of-features with SVM framework. Experiments are carried out on two popular datasets of varying complexity, and we demonstrate performance comparable to other descriptors with reduced computational complexity.
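The memory and speed advantage of a binary descriptor comes largely from the fact that two descriptors can be compared with a Hamming distance (XOR plus bit count) rather than a floating-point norm. The Python sketch below illustrates that comparison only; it is not the paper's method, it does not cover the Motion Boundary Histogram part, and the function names and toy byte strings are assumptions. A real BRISK descriptor is 512 bits (64 bytes); single bytes are used here to keep the example readable.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def nearest_descriptor(query: bytes, candidates):
    """Index and distance of the candidate closest to the query (brute force)."""
    distances = [hamming_distance(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# Toy one-byte descriptors; the second candidate differs from the query
# in a single bit and is therefore the nearest.
toy_query = b"\x0f"
toy_candidates = [b"\xff", b"\x0e", b"\x00"]
best_index, best_distance = nearest_descriptor(toy_query, toy_candidates)
```

Because each comparison reduces to integer XOR and bit counting, matching cost scales with descriptor length in bytes, which suits the hand-held devices mentioned in the abstract.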
Abstract:
This paper introduces an improved line tracker using IMU and vision data for visual servoing tasks. We utilize an Image Jacobian which relates the motion of a line feature to the corresponding camera movements. These camera motions are estimated using an IMU. We demonstrate the impact of the proposed method in challenging environments: maximum angular rate ~160°/s, acceleration ~6 m/s², and cluttered outdoor scenes. Simulation and a quantitative tracking performance comparison with the Visual Servoing Platform (ViSP) are also presented.