985 results for Object vision


Relevance: 70.00%

Abstract:

We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes.
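
Schematically (the term names below are chosen here for illustration and are not the paper's notation), the additive rule says that perceived dissimilarity decomposes into a sum over attribute differences:

```latex
D(A,B) = \underbrace{D_{\text{contour}}(A,B)}_{\text{local part matching}}
       + \underbrace{D_{\text{texture}}(A,B)}_{\text{internal details}}
       + \underbrace{D_{\text{symmetry}}(A,B)}_{\text{emergent attributes}}
       + \underbrace{D_{\text{global}}(A,B)}_{\text{orientation, configuration}}
```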

Relevance: 60.00%

Abstract:

Rotations in depth are challenging for object vision because features can appear, disappear, or be stretched or compressed. Yet we easily recognize objects across views. Are the underlying representations view invariant or view dependent? This question has been intensely debated in human vision, but the neuronal representations remain poorly understood. Here, we show that for naturalistic objects, neurons in the monkey inferotemporal (IT) cortex undergo a dynamic transition in time, whereby they are initially sensitive to viewpoint and later encode view-invariant object identity. This transition depended on two aspects of object structure: it was strongest when objects foreshortened strongly across views and were similar to each other. View invariance in IT neurons was present even when objects were reduced to silhouettes, suggesting that it can arise through similarity between the external contours of objects across views. Our results inform the viewpoint debate by showing that view invariance arises dynamically in IT neurons out of a representation that is initially view dependent.

Relevance: 60.00%

Abstract:

Working memory is the process of actively maintaining a representation of information for a brief period of time so that it is available for use. In monkeys, visual working memory involves the concerted activity of a distributed neural system, including posterior areas in visual cortex and anterior areas in prefrontal cortex. Within visual cortex, ventral stream areas are selectively involved in object vision, whereas dorsal stream areas are selectively involved in spatial vision. This domain specificity appears to extend forward into prefrontal cortex, with ventrolateral areas involved mainly in working memory for objects and dorsolateral areas involved mainly in working memory for spatial locations. The organization of this distributed neural system for working memory in monkeys appears to be conserved in humans, though some differences between the two species exist. In humans, as compared with monkeys, areas specialized for object vision in the ventral stream have a more inferior location in temporal cortex, whereas areas specialized for spatial vision in the dorsal stream have a more superior location in parietal cortex. Displacement of both sets of visual areas away from the posterior perisylvian cortex may be related to the emergence of language over the course of brain evolution. Whereas areas specialized for object working memory in humans and monkeys are similarly located in ventrolateral prefrontal cortex, those specialized for spatial working memory occupy a more superior and posterior location within dorsal prefrontal cortex in humans than in monkeys. As in posterior cortex, this displacement in frontal cortex also may be related to the emergence of new areas to serve distinctively human cognitive abilities.

Relevance: 60.00%

Abstract:

Considerable evidence exists to support the hypothesis that the hippocampus and related medial temporal lobe structures are crucial for the encoding and storage of information in long-term memory. Few human imaging studies, however, have successfully shown signal intensity changes in these areas during encoding or retrieval. Using functional magnetic resonance imaging (fMRI), we studied normal human subjects while they performed a novel picture encoding task. High-speed echo-planar imaging was used to evaluate fMRI signal changes throughout the brain. During the encoding of novel pictures, statistically significant increases in fMRI signal were observed bilaterally in the posterior hippocampal formation and parahippocampal gyrus and in the lingual and fusiform gyri. To our knowledge, this experiment is the first fMRI study to show robust signal changes in the human hippocampal region. It also provides evidence that the encoding of novel, complex pictures depends upon an interaction between ventral cortical regions, specialized for object vision, and the hippocampal formation and parahippocampal gyrus, specialized for long-term memory.

Relevance: 60.00%

Abstract:

Disruptive colouration is a form of visual camouflage composed of false edges and boundaries. Many disruptively camouflaged animals feature enhanced edges: light patches are surrounded by a lighter outline and/or dark patches are surrounded by a darker outline. This camouflage is particularly common in amphibians, reptiles and lepidopterans. We explored the role that this pattern plays in creating effective camouflage. In a visual search task utilising an ultra-large display area, mimicking search tasks that might be found in nature, edge-enhanced disruptive camouflage increased crypsis, even on substrates that did not provide an obvious visual match. Specifically, edge-enhanced camouflage was effective on backgrounds both with and without shadows; i.e., its benefit is not solely due to background matching of the dark edge-enhancement element with the shadows. Furthermore, when the dark component of the edge enhancement was omitted, the camouflage still provided better crypsis than control patterns without edge enhancement. This kind of edge enhancement improved camouflage on all background types. Lastly, we show that edge enhancement can create the perception of multiple surfaces. We conclude that edge enhancement increases the effectiveness of disruptive camouflage through mechanisms that may include improved disruption of the object outline by implying pictorial relief.

Relevance: 40.00%

Abstract:

We propose a method for learning specific object representations that can be applied (and reused) in visual detection and identification tasks. A machine learning technique called Cartesian Genetic Programming (CGP) is used to create these models from a series of images. Our research investigates how manipulation actions might allow for the development of better visual models and therefore better robot vision. This paper describes how visual object representations can be learned and improved by performing object manipulation actions, such as poke, push and pick-up, with a humanoid robot. The improvement can be measured, allowing the robot to select and perform the `right' action, i.e. the action with the greatest expected improvement of the detector.
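
As a rough sketch of this action-selection loop (the object names, the detector interface, and the accuracy metric below are all assumptions for illustration, not the paper's code):

```python
# Hypothetical sketch: try each manipulation action, retrain the CGP-based
# detector on the views the action yields, and keep the action whose
# detector improves the most. All names are illustrative.

def detector_accuracy(detector, validation_set):
    """Fraction of held-out images the detector gets right (assumed API)."""
    return sum(detector.correct(img) for img in validation_set) / len(validation_set)

def select_best_action(robot, detector, actions, validation_set):
    baseline = detector_accuracy(detector, validation_set)
    best_action, best_gain = None, 0.0
    for action in actions:                            # e.g. "poke", "push", "pick-up"
        new_views = robot.perform(action)             # images gathered while manipulating
        candidate = detector.retrained_on(new_views)  # evolve a new CGP model
        gain = detector_accuracy(candidate, validation_set) - baseline
        if gain > best_gain:
            best_action, best_gain = action, gain
    return best_action, best_gain
```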

Relevance: 40.00%

Abstract:

Tracking methods have the potential to retrieve the spatial location of project-related entities such as personnel and equipment at construction sites, which can facilitate several construction management tasks. Existing tracking methods are mainly based on Radio Frequency (RF) technologies and thus require manual deployment of tags. On construction sites with numerous entities, tag installation, maintenance and decommissioning become an issue, since they increase the cost and time needed to implement these tracking methods. To address these limitations, this paper proposes an alternative vision-based 3D tracking method. It operates by tracking the designated object in 2D video frames and correlating the tracking results from multiple pre-calibrated views using epipolar geometry. The methodology presented in this paper has been implemented and tested on videos taken under controlled experimental conditions. Results are compared with the actual 3D positions to validate its performance.
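
A minimal sketch of the multi-view step, assuming OpenCV and two pre-calibrated cameras (the paper's pipeline is more involved; the projection matrices would come from calibration, and the pixel coordinates from the 2D tracker):

```python
# Correlate 2D tracks from two calibrated views and recover the 3D position
# by triangulation. P1, P2: 3x4 camera projection matrices from calibration;
# pt1, pt2: the tracked object's pixel coordinates in each view.
import numpy as np
import cv2

def triangulate(P1, P2, pt1, pt2):
    pts1 = np.asarray(pt1, dtype=float).reshape(2, 1)
    pts2 = np.asarray(pt2, dtype=float).reshape(2, 1)
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # homogeneous 4x1 result
    return (X_h[:3] / X_h[3]).ravel()                # Euclidean 3D point
```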

Relevance: 40.00%

Abstract:

While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street), and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and we show how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.
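
One simple way to realize such contextual priors (a schematic reading of the idea, not the authors' exact formulation) is to reweight local detector evidence by a place-conditioned prior:

```python
# Illustrative combination of a context prior with local detector evidence.
# The multiplicative factorization and the example numbers are assumptions.

def object_posterior(local_likelihoods, context_priors):
    """Both arguments: dicts mapping object label -> score."""
    unnorm = {o: local_likelihoods[o] * context_priors.get(o, 1e-6)
              for o in local_likelihoods}
    z = sum(unnorm.values())
    return {o: v / z for o, v in unnorm.items()}

# E.g., in a scene categorized as "office", the prior boosts "computer"
# and suppresses "car" before any local detection is run.
posterior = object_posterior(
    {"computer": 0.40, "car": 0.35, "chair": 0.25},   # local evidence
    {"computer": 0.60, "chair": 0.35, "car": 0.05},   # context prior
)
```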

Relevance: 40.00%

Abstract:

There are roughly two processing systems: (1) very fast gist vision of entire scenes, completely bottom-up and data driven, and (2) Focus-of-Attention (FoA) with sequential screening of specific image regions and objects. The latter system has to be sequential because unnormalised input objects must be matched against normalised templates of canonical object views stored in memory, which involves dynamic routing of features in the visual pathways.
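
A toy rendering of the sequential FoA matching described above (entirely schematic: real dynamic routing normalises position, scale and other view parameters, not just contrast, and the interfaces here are assumptions):

```python
import numpy as np

def normalise(patch):
    """Crude contrast normalisation; assumes all patches share one shape."""
    p = patch.astype(float)
    return (p - p.mean()) / (p.std() + 1e-8)

def match_sequentially(regions, templates):
    """Attend to candidate image regions one at a time, correlating each
    normalised region against stored canonical templates."""
    best_label, best_score = None, -np.inf
    for region in regions:
        canon = normalise(region)
        for label, template in templates.items():
            score = float((canon * normalise(template)).mean())
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score
```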

Relevance: 40.00%

Abstract:

Multi-scale representations of lines, edges and keypoints on the basis of simple, complex and end-stopped cells can be used for object categorisation and recognition (Rodrigues and du Buf, 2009 BioSystems 95 206-226). These representations are complemented by saliency maps of colour, texture, disparity and motion information, which also serve to model extremely fast gist vision in parallel with object segregation. We present a low-level geometry model based on a single type of self-adjusting grouping cell, with a circular array of dendrites connected to edge cells located at several angles.
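
A schematic of such a grouping cell (the radii, the dendrite count, and the self-adjustment rule below are assumptions for illustration, not the model's parameters):

```python
# Dendrites sample edge-cell activity on a circle around the cell's position,
# one dendrite per angle; "self-adjustment" is sketched here as picking the
# circle radius with the strongest edge support.
import numpy as np

def grouping_cell_response(edge_map, cx, cy, radii=(4, 8, 16), n_dendrites=16):
    """edge_map: 2D array of edge-cell responses; (cx, cy): cell centre."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_dendrites, endpoint=False)
    best = 0.0
    for r in radii:
        xs = np.clip((cx + r * np.cos(angles)).astype(int), 0, edge_map.shape[1] - 1)
        ys = np.clip((cy + r * np.sin(angles)).astype(int), 0, edge_map.shape[0] - 1)
        best = max(best, float(edge_map[ys, xs].mean()))
    return best
```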

Relevance: 40.00%

Abstract:

Vision-based tracking sensors typically provide nonlinear measurements of the target's Cartesian position and velocity state components. In this paper we derive linear measurements in the target's Cartesian position and velocity components, using an analytical measurement conversion technique that can be used with two (or more) vision sensors, and we derive a robust version of a linear Kalman filter. We show that our robust linear filter significantly outperforms the extended Kalman filter. Moreover, we prove that the state estimation error is bounded.
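
Once the converted measurements are linear in the state, the standard linear Kalman recursion applies; a minimal textbook sketch (not the paper's robust variant, and the matrices are placeholders) is:

```python
# One predict/update cycle of a linear Kalman filter.
# x: state estimate, P: state covariance, z: converted (linear) measurement,
# F: state transition, H: measurement matrix, Q/R: process/measurement noise.
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```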