495 results for Visual cue integration
Abstract:
Purpose: Age-related macular degeneration (AMD) is the leading cause of irreversible visual impairment among older adults. This study explored the relationship between AMD, falls risk and other injuries and identified visual risk factors for these adverse events. Methods: Participants included 76 community-dwelling individuals with a range of severity of AMD (mean age, 77.0±6.9 years). Baseline assessment included binocular visual acuity, contrast sensitivity and merged visual fields. Participants completed monthly falls and injury diaries for one year following the baseline assessment. Results: Overall, 74% of participants reported having either a fall, injurious fall or other injury. Fifty-four percent of participants reported a fall and 30% reported more than one fall; of the 102 falls reported, 63% resulted in an injury. Most occurred outdoors (52%), between late morning and late afternoon (61%) and when navigating on level ground (62%). The most common non-fall injuries were lacerations (36%) and collisions with an object (35%). Reduced contrast sensitivity and visual acuity were associated with increased fall rate, after controlling for age, gender, cognitive function, cataract severity and self-reported physical function. Reduced contrast sensitivity was the only significant predictor of falls and other injuries. Conclusion: Among older adults with AMD, increased visual impairment was significantly associated with an increased incidence of falls and other injuries. Reduced contrast sensitivity was significantly associated with increased rates of falls, injurious falls and injuries, while reduced visual acuity was only associated with increased falls risk. These findings have important implications for the assessment of visually impaired older adults.
Abstract:
Insensitivity to visual noise is important for audio-visual speech recognition (AVSR). Visual noise can take a number of forms, such as varying frame rate, occlusion, lighting or speaker variability. The use of a high-dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Preliminary results are presented demonstrating performance above the catastrophic fusion boundary for our confidence measure irrespective of the type of visual noise presented to it. Our experiments were restricted to small-vocabulary applications.
Abstract:
The performance of automatic speech recognition systems deteriorates in the presence of noise. One known solution is to incorporate video information with an existing acoustic speech recognition system. We investigate the performance of the individual acoustic and visual sub-systems and then examine different ways in which the integration of the two systems may be performed. The system is to be implemented in real time on a Texas Instruments' TMS320C80 DSP.
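One common way to integrate two such sub-systems is late (decision-level) fusion, in which per-word scores from each modality are combined with a reliability weight. The sketch below is purely illustrative and is not the method described in the abstract: the function name `fuse_scores`, the parameter `audio_weight` and the example scores are all hypothetical.

```python
from typing import Dict


def fuse_scores(audio_ll: Dict[str, float],
                video_ll: Dict[str, float],
                audio_weight: float) -> str:
    """Return the word with the highest weighted sum of log-likelihoods.

    audio_weight is a hypothetical reliability weight in [0, 1]:
    values near 1 trust the acoustic scores (clean audio), values
    near 0 trust the visual scores (noisy audio).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    # Only words scored by both sub-systems can be fused.
    words = audio_ll.keys() & video_ll.keys()
    if not words:
        raise ValueError("no words are scored by both modalities")
    fused = {
        w: audio_weight * audio_ll[w] + (1.0 - audio_weight) * video_ll[w]
        for w in words
    }
    return max(fused, key=fused.get)


# Made-up log-likelihoods for a two-word vocabulary.
audio = {"yes": -12.0, "no": -10.0}
video = {"yes": -3.0, "no": -9.0}
clean_audio_choice = fuse_scores(audio, video, audio_weight=0.9)
noisy_audio_choice = fuse_scores(audio, video, audio_weight=0.2)
```

With these made-up scores, the decision follows the acoustic sub-system when `audio_weight` is high and the visual sub-system when it is low, which is the behaviour a fusion scheme is meant to expose.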
Abstract:
The OED reminds us as surely as Ovid that a labyrinth is a “structure consisting of a number of intercommunicating passages arranged in bewildering complexity, through which it is difficult or impossible to find one’s way without guidance”. Both Shaun Tan’s The Arrival (2006) and Matt Ottley’s Requiem for a Beast: A Work for Image, Word and Music (2007) mark a kind of labyrinthine watershed in Australian children’s literature. Deploying complex, intercommunicating logics of story and literacy, these books make high demands of their reader but also offer guidance for the successful navigation of their stories; for their protagonists as surely as for readers. That the shared logic of navigation in each book is literacy as privileged form of meaning-making is not surprising in the sense that within “a culture deeply invested in myths of individualism and self-sufficiency, it is easy to see why literacy is glorified as an attribute of individual control and achievement” (Williams and Zenger 166). The extent to which these books might be read as exemplifying desired norms of contemporary Australian culture seems to be affirmed by the fact of Tan and Ottley winning the Australian “Picture Book of the Year” prize awarded by the Children’s Book Council of Australia in 2007 and 2008 respectively. However, taking its cue from Ottley’s explicit intertextual use of the myth of Theseus and from Tan’s visual rhetoric of lostness and displacement, this paper reads these texts’ engagement with tropes of “literacy” in order to consider the ways in which norms of gender and culture seemingly circulated within these texts might be undermined by constructions of “nation” itself as a labyrinth that can only partly be negotiated by a literate subject.
In doing so, I argue that these picture books, to varying degrees, reveal a perpetuation of the “literacy myth” (Graff 12) as a discourse of safety and agency but simultaneously bear traces of Ariadne’s story, wherein literacy alone is insufficient for safe navigation of the labyrinth of culture.
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction, or multi-channel beamforming is not known. This is an important question to be answered especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset and results show that synchronous HMM-based audio-visual fusion can outperform traditional single as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
Abstract:
Micro aerial vehicles (MAVs) are a rapidly growing area of research and development in robotics. For autonomous robot operations, localization has typically been calculated using GPS, external camera arrays, or onboard range or vision sensing. In cluttered indoor or outdoor environments, onboard sensing is the only viable option. In this paper we present an appearance-based approach to visual SLAM on a flying MAV using only low quality vision. Our approach consists of a visual place recognition algorithm that operates on 1000 pixel images, a lightweight visual odometry algorithm, and a visual expectation algorithm that improves the recall of place sequences and the precision with which they are recalled as the robot flies along a similar path. Using data gathered from outdoor datasets, we show that the system is able to perform visual recognition with low quality, intermittent visual sensory data. By combining the visual algorithms with the RatSLAM system, we also demonstrate how the algorithms enable successful SLAM.
Abstract:
Diabetes is an increasingly prevalent disease worldwide. Providing early management of the complications can prevent morbidity and mortality in this population. Peripheral neuropathy, a significant complication of diabetes, is the major cause of foot ulceration and amputation in diabetes. Delay in attending to complications of the disease contributes to significant medical expenses for diabetic patients and the community. Early structural changes to the neural components of the retina have been demonstrated to occur prior to the clinically visible retinal vasculature complication of diabetic retinopathy. Additionally, visual function loss has been shown to exist before the ophthalmoscopic manifestations of vasculature damage. The purpose of this thesis was to evaluate the relationship between diabetic peripheral neuropathy and both retinal structure and visual function. The key question was whether diabetic peripheral neuropathy is the potential underlying factor responsible for retinal anatomical change and visual functional loss in people with diabetes. This study was conducted on a cohort with type 2 diabetes. Retinal nerve fibre layer (RNFL) thickness was assessed by means of Optical Coherence Tomography (OCT). Visual function was assessed using two different methods; Standard Automated Perimetry (SAP) and flicker perimetry were performed within the central 30 degrees of fixation. The level of diabetic peripheral neuropathy (DPN) was assessed using two techniques: Quantitative Sensory Testing (QST) and Neuropathy Disability Score (NDS). These techniques are known to be capable of detecting DPN at very early stages. NDS has also been shown as a gold standard for detecting 'risk of foot ulceration'. Findings reported in this thesis showed that RNFL thickness, particularly in the inferior quadrant, has a significant association with severity of DPN when the condition has been assessed using NDS.
More specifically, it was observed that inferior RNFL thickness has the ability to differentiate individuals who are at higher risk of foot ulceration from those who are at lower risk, indicating that RNFL thickness can predict late-stage DPN. Investigating the association between RNFL and QST did not show any meaningful interaction, which indicates that RNFL thickness for this cohort was not as predictive of neuropathy status as NDS. In both of these studies, control participants did not have different results from the type 2 cohort who did not have DPN, suggesting that RNFL thickness is not a marker for diagnosing DPN at early stages. The latter finding also indicated that diabetes per se is unlikely to affect the RNFL thickness. Visual function as measured by SAP and flicker perimetry was found to be associated with severity of peripheral neuropathy as measured by NDS. These findings were also capable of differentiating individuals at higher risk of foot ulceration; however, visual function also proved not to be a marker for early diagnosis of DPN. It was found that neither SAP nor flicker sensitivity has a meaningful association with DPN when neuropathy status was measured using QST. Importantly, diabetic retinopathy did not explain any of the findings in these experiments. The work described here is valuable as no other research to date has investigated the association between diabetic peripheral neuropathy and either retinal structure or visual function.
Abstract:
Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely in the audio domain, particularly in noisy acoustic conditions. However, most of the research conducted in visual voice activity detection (VVAD) has neglected variabilities in the visual domain such as viewpoint variation. In this paper we investigate the effectiveness of the visual information from the speaker’s frontal and profile views (i.e., left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front-end approach and the Gaussian mixture model (GMM) based VVAD framework, and report the experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views, and we give a quantitative comparison of VVAD based on frontal and profile views. The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker’s face may not always be frontal.
Abstract:
In this paper, we present a method for the recovery of position and absolute attitude (including pitch, roll and yaw) using a novel fusion of monocular Visual Odometry and GPS measurements in a similar manner to a classic loosely-coupled GPS/INS error state navigation filter. The proposed filter does not require additional restrictions or assumptions such as platform-specific dynamics, map-matching, feature-tracking, visual loop-closing or a gravity vector, nor additional sensors such as an IMU or magnetic compass. An observability analysis of the proposed filter is performed, showing that the scale factor, position and attitude errors are fully observable under acceleration that is non-parallel to the velocity vector in the navigation frame. The observability properties of the proposed filter are demonstrated using numerical simulations. We conclude the article with an implementation of the proposed filter using real flight data collected from a Cessna 172 equipped with a downwards-looking camera and GPS, showing the feasibility of the algorithm in real-world conditions.
Abstract:
We modified a commercial Hartmann-Shack aberrometer and used it to measure ocular aberrations across the central 42° horizontal × 32° vertical visual fields of five young emmetropic subjects. Some Zernike aberration coefficients showed field distributions that were similar to the field dependence predicted by Seidel theory (astigmatism, oblique astigmatism, horizontal coma, vertical coma), but defocus did not demonstrate such similarity.
Abstract:
This article presents a visual servoing system that allows a Micro Unmanned Aerial Vehicle (MUAV) to follow a moving 3D object. The presented control strategy is based only on the visual information given by an adaptive tracking method based on colour information. A visual fuzzy system has been developed for servoing the camera mounted on a rotary-wing MUAV, which also takes the vehicle's own dynamics into account. The system focuses on continuously following a moving aerial target, keeping it at a fixed safe distance and centred on the image plane. The algorithm is validated in real outdoor flights, showing the robustness of the proposed system against wind perturbations and illumination and weather changes, among others. The results indicate that the proposed algorithm is suitable for complex control tasks such as object following and pursuit and flying in formation, as well as for indoor navigation.
Abstract:
To address issues of divisive ideologies in the Mathematics Education community and to subsequently advance educational practice, an alternative theoretical framework and operational model is proposed which represents a consilience of constructivist learning theories whilst acknowledging the objective but improvable nature of domain knowledge. Based upon Popper’s three-world model of knowledge, the proposed theory supports the differentiation and explicit modelling of both shared domain knowledge and idiosyncratic personal understanding using a visual nomenclature. The visual nomenclature embodies Piaget’s notion of reflective abstraction and so may support an individual’s experience-based transformation of personal understanding with regards to shared domain knowledge. Using the operational model and visual nomenclature, seminal literature regarding early-number counting and addition was analysed and described. Exemplars of the resultant visual artefacts demonstrate the proposed theory’s viability as a tool with which to characterise the reflective abstraction-based organisation of a domain’s shared knowledge. Utilising such a description of knowledge, future research needs to consider the refinement of the operational model and visual nomenclature to include the analysis, description and scaffolded transformation of personal understanding. A detailed model of knowledge and understanding may then underpin the future development of educational software tools such as computer-mediated teaching and learning environments.