902 resultados para Visual Information


Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents a 100 Hz monocular position based visual servoing system to control a quadrotor flying in close proximity to vertical structures approximating a narrow, locally linear shape. Assuming the object boundaries are represented by parallel vertical lines in the image, detection and tracking is achieved using Plücker line representation and a line tracker. The visual information is fused with IMU data in an EKF framework to provide fast and accurate state estimation. A nested control design provides position and velocity control with respect to the object. Our approach is aimed at high performance on-board control for applications allowing only small error margins and without a motion capture system, as required for real world infrastructure inspection. Simulated and ground-truthed experimental results are presented.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. The use of visual information in the form of lip movements of the speaker in addition to audio and the use of topic of the speech segments, and the expected frequency of words in the target speech domain, are proposed. By using these complementary information, improvement in the performance of STD has been achieved which enables efficient search of key words in large collection of multimedia documents.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Purpose Little is known about the prevalence of refractive error, binocular vision, and other visual conditions in Australian Indigenous children. This is important given the association of these visual conditions with reduced reading performance in the wider population, which may also contribute to the suboptimal reading performance reported in this population. The aim of this study was to develop a visual profile of Queensland Indigenous children. Methods Vision testing was performed on 595 primary schoolchildren in Queensland, Australia. Vision parameters measured included visual acuity, refractive error, color vision, nearpoint of convergence, horizontal heterophoria, fusional vergence range, accommodative facility, AC/A ratio, visual motor integration, and rapid automatized naming. Near heterophoria, nearpoint of convergence, and near fusional vergence range were used to classify convergence insufficiency (CI). Results Although refractive error (Indigenous, 10%; non-Indigenous, 16%; p = 0.04) and strabismus (Indigenous, 0%; non-Indigenous, 3%; p = 0.03) were significantly less common in Indigenous children, CI was twice as prevalent (Indigenous, 10%; non-Indigenous, 5%; p = 0.04). Reduced visual information processing skills were more common in Indigenous children (reduced visual motor integration [Indigenous, 28%; non-Indigenous, 16%; p < 0.01] and slower rapid automatized naming [Indigenous, 67%; non-Indigenous, 59%; p = 0.04]). The prevalence of visual impairment (reduced visual acuity) and color vision deficiency was similar between groups. Conclusions Indigenous children have less refractive error and strabismus than their non-Indigenous peers. However, CI and reduced visual information processing skills were more common in this group. Given that vision screenings primarily target visual acuity assessment and strabismus detection, this is an important finding as many Indigenous children with CI and reduced visual information processing may be missed. Emphasis should be placed on identifying children with CI and reduced visual information processing given the potential effect of these conditions on school performance

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Intact function of working memory (WM) is essential for children and adults to cope with every day life. Children with deficits in WM mechanisms have learning difficulties that are often accompanied by behavioral problems. The neural processes subserving WM, and brain structures underlying this system, continue to develop during childhood till adolescence and young adulthood. With functional magnetic resonance imaging (fMRI) it is possible to investigate the organization and development of WM. The present thesis aimed to investigate, using behavioral and neuroimaging methods, whether mnemonic processing of spatial and nonspatial visual information is segregated in the developing and mature human brain. A further aim in this research was to investigate the organization and development of audiospatial and visuospatial information processing in WM. The behavioral results showed that spatial and nonspatial visual WM processing is segregated in the adult brain. The fMRI result in children suggested that memory load related processing of spatial and nonspatial visual information engages common cortical networks, whereas selective attention to either type of stimuli recruits partially segregated areas in the frontal, parietal and occipital cortices. Deactivation mechanisms that are important in the performance of WM tasks in adults are already operational in healthy school-aged children. Electrophysiological evidence suggested segregated mnemonic processing of visual and auditory location information. The results of the development of audiospatial and visuospatial WM demonstrate that WM performance improves with age, suggesting functional maturation of underlying cognitive processes and brain areas. The development of the performance of spatial WM tasks follows a different time course in boys and girls indicating a larger degree of immaturity in the male than female WM systems. Furthermore, the differences in mastering auditory and visual WM tasks may indicate that visual WM reaches functional maturity earlier than the corresponding auditory system. Spatial WM deficits may underlie some learning difficulties and behavioral problems related to impulsivity, difficulties in concentration, and hyperactivity. Alternatively, anxiety or depressive symptoms may affect WM function and the ability to concentrate, being thus the primary cause of poor academic achievement in children.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Scene understanding has been investigated from a mainly visual information point of view. Recently depth has been provided an extra wealth of information, allowing more geometric knowledge to fuse into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact humans, when growing up, seem to heavily investigate the world around them by haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The actions performed consecutively improve the segmentation of objects in the scene.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

How the brain maintains perceptual continuity across eye movements that yield discontinuous snapshots of the world is still poorly understood. In this study, we adapted a framework from the dual-task paradigm, well suited to reveal bottlenecks in mental processing, to study how information is processed across sequential saccades. The pattern of RTs allowed us to distinguish among three forms of trans-saccadic processing (no trans-saccadic processing, trans-saccadic visual processing and trans-saccadic visual processing and saccade planning models). Using a cued double-step saccade task, we show that even though saccade execution is a processing bottleneck, limiting access to incoming visual information, partial visual and motor processing that occur prior to saccade execution is used to guide the next eye movement. These results provide insights into how the oculomotor system is designed to process information across multiple fixations that occur during natural scanning.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Visual information is difficult to search and interpret when the density of the displayed information is high or the layout is chaotic. Visual information that exhibits such properties is generally referred to as being "cluttered." Clutter should be avoided in information visualizations and interface design in general because it can severely degrade task performance. Although previous studies have identified computable correlates of clutter (such as local feature variance and edge density), understanding of why humans perceive some scenes as being more cluttered than others remains limited. Here, we explore an account of clutter that is inspired by findings from visual perception studies. Specifically, we test the hypothesis that the so-called "crowding" phenomenon is an important constituent of clutter. We constructed an algorithm to predict visual clutter in arbitrary images by estimating the perceptual impairment due to crowding. After verifying that this model can reproduce crowding data we tested whether it can also predict clutter. We found that its predictions correlate well with both subjective clutter assessments and search performance in cluttered scenes. These results suggest that crowding and clutter may indeed be closely related concepts and suggest avenues for further research.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The human motor system is remarkably proficient in the online control of visually guided movements, adjusting to changes in the visual scene within 100 ms [1-3]. This is achieved through a set of highly automatic processes [4] translating visual information into representations suitable for motor control [5, 6]. For this to be accomplished, visual information pertaining to target and hand need to be identified and linked to the appropriate internal representations during the movement. Meanwhile, other visual information must be filtered out, which is especially demanding in visually cluttered natural environments. If selection of relevant sensory information for online control was achieved by visual attention, its limited capacity [7] would substantially constrain the efficiency of visuomotor feedback control. Here we demonstrate that both exogenously and endogenously cued attention facilitate the processing of visual target information [8], but not of visual hand information. Moreover, distracting visual information is more efficiently filtered out during the extraction of hand compared to target information. Our results therefore suggest the existence of a dedicated visuomotor binding mechanism that links the hand representation in visual and motor systems.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The goal of this work is to navigate through an office environmentsusing only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We describe a form of amnesia, which we have called visual memory-deficit amnesia, that is caused by damage to areas of the visual system that store visual information. Because it is caused by a deficit in access to stored visual material and not by an impaired ability to encode or retrieve new material, it has the otherwise infrequent properties of a more severe retrograde than anterograde amnesia with no temporal gradient in the retrograde amnesia. Of the 11 cases of long-term visual memory loss found in the literature, all had amnesia extending beyond a loss of visual memory, often including a near total loss of pretraumatic episodic memory. Of the 6 cases in which both the severity of retrograde and anterograde amnesia and the temporal gradient of the retrograde amnesia were noted, 4 had a more severe retrograde amnesia with no temporal gradient and 2 had a less severe retrograde amnesia with a temporal gradient.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Recent memories are generally recalled from a first-person perspective whereas older memories are often recalled from a third-person perspective. We investigated how repeated retrieval affects the availability of visual information, and whether it could explain the observed shift in perspective with time. In Experiment 1, participants performed mini-events and nominated memories of recent autobiographical events in response to cue words. Next, they described their memory for each event and rated its phenomenological characteristics. Over the following three weeks, they repeatedly retrieved half of the mini-event and cue-word memories. No instructions were given about how to retrieve the memories. In Experiment 2, participants were asked to adopt either a first- or third-person perspective during retrieval. One month later, participants retrieved all of the memories and again provided phenomenology ratings. When first-person visual details from the event were repeatedly retrieved, this information was retained better and the shift in perspective was slowed.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Rapid orientating movements of the eyes are believed to be controlled ballistically. The mechanism underlying this control is thought to involve a comparison between the desired displacement of the eye and an estimate of its actual position (obtained from the integration of the eye velocity signal). This study shows, however, that under certain circumstances fast gaze movements may be controlled quite differently and may involve mechanisms which use visual information to guide movements prospectively. Subjects were required to make large gaze shifts in yaw towards a target whose location and motion were unknown prior to movement onset. Six of those tested demonstrated remarkable accuracy when making gaze shifts towards a target that appeared during their ongoing movement. In fact their level of accuracy was not significantly different from that shown when they performed a 'remembered' gaze shift to a known stationary target (F-3,F-15 = 0.15, p > 0.05). The lack of a stereotypical relationship between the skew of the gaze velocity profile and movement duration indicates that on-line modifications were being made. It is suggested that a fast route from the retina to the superior colliculus could account for this behaviour and that models of oculomotor control need to be updated.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Recent studies suggested that the control of hand movements in catching involves continuous vision-based adjustments. More insight into these adjustments may be gained by examining the effects of occluding different parts of the ball trajectory. Here, we examined the effects of such occlusion on lateral hand movements when catching balls approaching from different directions, with the occlusion conditions presented in blocks or in randomized order. The analyses showed that late occlusion only had an effect during the blocked presentation, and early occlusion only during the randomized presentation. During the randomized presentation movement biases were more leftward if the preceding trial was an early occlusion trial. The effect of early occlusion during the randomized presentation suggests that the observed leftward movement bias relates to the rightward visual acceleration inherent to the ball trajectories used, while its absence during the blocked presentation seems to reflect trial-by-trial adaptations in the visuomotor gain, reminiscent of dynamic gain control in the smooth pursuit system. The movement biases during the late occlusion block were interpreted in terms of an incomplete motion extrapolation--a reduction of the velocity gain--caused by the fact that participants never saw the to-be-extrapolated part of the ball trajectory. These results underscore that continuous movement adjustments for catching do not only depend on visual information, but also on visuomotor adaptations based on non-visual information.