837 results for Scene perception
Abstract:
We investigated the effect of image size on saccade amplitudes. First, in a meta-analysis, relevant results from previous scene perception studies are summarised, suggesting the possibility of a linear relationship between mean saccade amplitude and image size. Forty-eight observers viewed 96 colour scene images scaled to four different sizes, while their eye movements were recorded. Mean and median saccade amplitudes were found to be directly proportional to image size, while the mode of the distribution lay in the range of very short saccades. However, saccade amplitudes expressed as percentages of image size were not constant over the different image sizes; on smaller stimulus images, the relative saccades were found to be larger, and vice versa. In sum, and as far as mean and median saccade amplitudes are concerned, the size of stimulus images is the dominant factor. Other factors, such as image properties, viewing task, or measurement equipment, are only of subordinate importance. Thus, the role of stimulus size has to be reconsidered, in theoretical as well as methodological terms.
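The distinction the abstract draws between absolute amplitudes and amplitudes expressed as a percentage of image size can be made concrete with a small sketch. The function names, the use of degrees of visual angle, and the sample numbers are illustrative assumptions, not the authors' analysis code.

```python
import statistics


def relative_amplitudes(amplitudes_deg, image_size_deg):
    """Express each saccade amplitude as a percentage of image size.

    Both arguments are assumed to be in the same unit (here: degrees
    of visual angle); the unit choice is hypothetical.
    """
    if image_size_deg <= 0:
        raise ValueError("image size must be positive")
    return [100.0 * a / image_size_deg for a in amplitudes_deg]


def summarize(amplitudes_deg, image_size_deg):
    """Return mean and median amplitude plus the mean relative amplitude."""
    relative = relative_amplitudes(amplitudes_deg, image_size_deg)
    return {
        "mean_deg": statistics.mean(amplitudes_deg),
        "median_deg": statistics.median(amplitudes_deg),
        "mean_pct": statistics.mean(relative),
    }


if __name__ == "__main__":
    # Made-up amplitudes for one image size, for illustration only.
    print(summarize([2.0, 4.0, 6.0], image_size_deg=20.0))
```

Computing this summary separately for each image size makes it possible to check whether the relative amplitude stays constant across sizes or, as the abstract reports, is larger on smaller images.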
Abstract:
Color has an unresolved role in the rapid processing of natural scenes. Temporal changes in the color effect might partly account for the debate. Moreover, the distinction between localized and unlocalized information has not been addressed directly in these color studies. Here we present two experiments that investigate whether color contributes to categorization of a briefly flashed natural image, and whether its contribution is mediated by time and low-level information. By controlling the interval between target and mask stimuli, Experiment 1 tested the hypothesis that color facilitates the early stage of scene perception and that the effect decays in later processing. Experiment 2 examined how randomizing local phase information influenced the advantage of color over gray. Together, the results suggest that color does enhance natural scene categorization at short exposure times. Furthermore, the results imply that the effect of color is stable between 12 and 120 ms and is not accounted for by structures organized by localized information. We therefore conclude that color consistently contributes to rapid scene categorization and that this contribution does not depend on localized information. The present study is thus an attempt to fill a gap in previous research; its results contribute to a deeper understanding of the role of color in natural scene perception.
Abstract:
How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goal-modulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.
Abstract:
The goal of this study is to identify cues for the cognitive process of attention in ancient Greek art, aiming to find confirmation of its possible use by ancient Greek audiences and artists. Evidence of cues that trigger attention’s psychological dispositions was searched through content analysis of image reproductions of ancient Greek sculpture and fine vase painting from the archaic to the Hellenistic period - ca. 7th-1st cent. BC. Through this analysis, it was possible to observe the presence of cues that trigger orientation to the work of art (i.e. amplification, contrast, emotional salience, simplification, symmetry), of a cue that triggers disseminated attention to the parts of the work (i.e. distribution of elements) and of cues that activate selective attention to specific elements in the work of art (i.e. contrast of elements, salient color, central positioning of elements, composition regarding the flow of elements and significant objects). Results support the universality of those dispositions, probably connected with basic competencies that are hard-wired in the nervous system and in the cognitive processes.
Abstract:
The human visual system is adept at detecting and encoding statistical regularities in its spatio-temporal environment. Here we report an unexpected failure of this ability in the context of perceiving inconsistencies in illumination distributions across a scene. Contrary to predictions from previous studies [Enns and Rensink, 1990; Sun and Perona, 1996a, 1996b, 1997], we find that the visual system displays a remarkable lack of sensitivity to illumination inconsistencies, both in experimental stimuli and in images of real scenes. Our results allow us to draw inferences regarding how the visual system encodes illumination distributions across scenes. Specifically, they suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction.
Abstract:
Do we view the world differently if it is described to us in figurative rather than literal terms? An answer to this question would reveal something about both the conceptual representation of figurative language and the scope of top-down influences on scene perception. Previous work has shown that participants will look longer at a path region of a picture when it is described with a type of figurative language called fictive motion (The road goes through the desert) rather than without (The road is in the desert). The current experiment provided evidence that such fictive motion descriptions affect eye movements by evoking mental representations of motion. If participants heard contextual information that would hinder actual motion, it influenced how they viewed a picture when it was described with fictive motion. Inspection times and eye movements scanning along the path increased during fictive motion descriptions when the terrain was first described as difficult (The desert is hilly) as compared to easy (The desert is flat); there were no such effects for descriptions without fictive motion. It is argued that fictive motion evokes a mental simulation of motion that is immediately integrated with visual processing, and hence figurative language can have a distinct effect on perception. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
An increasing number of neuroscience experiments are using virtual reality to provide a more immersive and less artificial experimental environment. This is particularly useful to navigation and three-dimensional scene perception experiments. Such experiments require accurate real-time tracking of the observer's head in order to render the virtual scene. Here, we present data on the accuracy of a commonly used six degrees of freedom tracker (Intersense IS900) when it is moved in ways typical of virtual reality applications. We compared the reported location of the tracker with its location computed by an optical tracking method. When the tracker was stationary, the root mean square error in spatial accuracy was 0.64 mm. However, we found that errors increased over ten-fold (up to 17 mm) when the tracker moved at speeds common in virtual reality applications. We demonstrate that the errors we report here are predominantly due to inaccuracies of the IS900 system rather than the optical tracking against which it was compared. (c) 2006 Elsevier B.V. All rights reserved.
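The accuracy figure quoted in the abstract is a root mean square error between tracker-reported and reference positions. A minimal sketch of that metric follows; the function name and the sample coordinates are assumptions made for illustration, and the abstract does not specify how the authors aligned or sampled the two data streams.

```python
import math


def rms_error(reported, reference):
    """Root mean square of the Euclidean distances between paired 3D points.

    `reported` and `reference` are equal-length sequences of (x, y, z)
    tuples in the same unit (for example millimetres).
    """
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty sequences of equal length")
    squared_distances = [
        sum((r - g) ** 2 for r, g in zip(p, q))
        for p, q in zip(reported, reference)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


if __name__ == "__main__":
    # Hypothetical samples: tracker output versus optical ground truth.
    tracker = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    optical = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(rms_error(tracker, optical))
```

Evaluating the metric separately for stationary and moving segments would reproduce the kind of comparison the abstract describes.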
Abstract:
When we actively explore the visual environment, our gaze preferentially selects regions characterized by high contrast and high density of edges, suggesting that the guidance of eye movements during visual exploration is driven to a significant degree by perceptual characteristics of a scene. Converging findings suggest that the selection of the visual target for the upcoming saccade critically depends on a covert shift of spatial attention. However, it is unclear whether attention selects the location of the next fixation uniquely on the basis of global scene structure or additionally on local perceptual information. To investigate the role of spatial attention in scene processing, we examined eye fixation patterns of patients with spatial neglect during unconstrained exploration of natural images and compared these to healthy and brain-injured control participants. We computed luminance, colour, contrast, and edge information contained in image patches surrounding each fixation and evaluated whether they differed from randomly selected image patches. At the global level, neglect patients showed the characteristic ipsilesional shift of the distribution of their fixations. At the local level, patients with neglect and control participants fixated image regions in ipsilesional space that were closely similar with respect to their local feature content. In contrast, when directing their gaze to contralesional (impaired) space neglect patients fixated regions of significantly higher local luminance and lower edge content than controls. These results suggest that intact spatial attention is necessary for the active sampling of local feature content during scene perception.
Abstract:
Scene understanding has mainly been investigated from the point of view of visual information. Recently, depth has provided an extra wealth of information, allowing more geometric knowledge to be fused into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact, humans, when growing up, seem to investigate the world around them heavily through haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The actions performed consecutively improve the segmentation of objects in the scene.
Abstract:
In the absence of cues for absolute depth measurement, such as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges and junctions may provide a 3D model of the scene but it will not inform about the actual "size" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, this is computationally complex due to the difficulty of the object recognition process. Here we propose a source of information for absolute depth estimation that does not rely on specific objects: we introduce a procedure for absolute depth estimation based on the recognition of the whole scene. The shape of the space of the scene and the structures present in the scene are strongly related to the scale of observation. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene, and therefore its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection.
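For context, the known-object cue that the abstract mentions as an alternative (not the scene-recognition procedure it proposes) follows from the standard pinhole camera relation: distance equals focal length times real object height divided by image height. The sketch below assumes an ideal pinhole camera with the focal length given in pixels; the function name and the numbers are hypothetical.

```python
def depth_from_known_size(focal_length_px, real_height_m, image_height_px):
    """Distance to an object under the pinhole model: Z = f * H / h.

    focal_length_px: focal length expressed in pixels (assumed known).
    real_height_m:   true height of the recognized object, in metres.
    image_height_px: height of the object's projection, in pixels.
    """
    if focal_length_px <= 0 or real_height_m <= 0 or image_height_px <= 0:
        raise ValueError("all inputs must be positive")
    return focal_length_px * real_height_m / image_height_px


if __name__ == "__main__":
    # Illustrative values only: a 1.7 m tall person spanning 170 px.
    print(depth_from_known_size(800.0, 1.7, 170.0))
```

The relation requires the object to be recognized and its true size known first, which is the dependency on object recognition that the abstract identifies as the difficulty.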
Abstract:
Over the last decade, television screens and display monitors have increased in size considerably, but has this improved our televisual experience? Our working hypothesis was that the audiences adopt a general strategy that “bigger is better.” However, as our visual perceptions do not tap directly into basic retinal image properties such as retinal image size (C. A. Burbeck, 1987), we wondered whether object size itself might be an important factor. To test this, we needed a task that would tap into the subjective experiences of participants watching a movie on different-sized displays with the same retinal subtense. Our participants used a line bisection task to self-report their level of “presence” (i.e., their involvement with the movie) at several target locations that were probed in a 45-min section of the movie “The Good, The Bad, and The Ugly.” Measures of pupil dilation and reaction time to the probes were also obtained. In Experiment 1, we found that subjective ratings of presence increased with physical screen size, supporting our hypothesis. Face scenes also produced higher presence scores than landscape scenes for both screen sizes. In Experiment 2, reaction time and pupil dilation results showed the same trends as the presence ratings and pupil dilation correlated with presence ratings, providing some validation of the method. Overall, the results suggest that real-time measures of subjective presence might be a valuable tool for measuring audience experience for different types of (i) display and (ii) audiovisual material.
Abstract:
This work aims to contribute to the reliability and integrity of perceptual systems of unmanned ground vehicles (UGV). A method is proposed to evaluate the quality of sensor data prior to its use in a perception system by utilising a quality metric applied to heterogeneous sensor data such as visual and infrared camera images. The concept is illustrated specifically with sensor data that is evaluated prior to the use of the data in a standard SIFT feature extraction and matching technique. The method is then evaluated using various experimental data sets that were collected from a UGV in challenging environmental conditions, represented by the presence of airborne dust and smoke. In the first series of experiments, a motionless vehicle is observing a ’reference’ scene, then the method is extended to the case of a moving vehicle by compensating for its motion. This paper shows that it is possible to anticipate degradation of a perception algorithm by evaluating the input data prior to any actual execution of the algorithm.
Abstract:
Semantic perception and object labeling are key requirements for robots interacting with objects on a higher level. Symbolic annotation of objects allows the usage of planning algorithms for object interaction, for instance in a typical fetch-and-carry scenario. In current research, perception is usually based on 3D scene reconstruction and geometric model matching, where trained features are matched with a 3D sample point cloud. In this work we propose a semantic perception method which is based on spatio-semantic features. These features are defined in a natural, symbolic way, such as geometry and spatial relation. In contrast to point-based model matching methods, a spatial ontology is used in which objects are instead described by how they "look", similar to how a human would describe unknown objects to another person. A fuzzy-based reasoning approach matches perceivable features with a spatial ontology of the objects. The approach provides a method which is able to deal with sensor noise and occlusions. Another advantage is that no training phase is needed in order to learn object features. The use case of the proposed method is the detection of soil sample containers in an outdoor environment which have to be collected by a mobile robot. The approach is verified using real-world experiments.