Abstract:
We investigated the effect of image size on saccade amplitudes. First, in a meta-analysis, we summarised relevant results from previous scene perception studies, which suggested a linear relationship between mean saccade amplitude and image size. In the main experiment, forty-eight observers viewed 96 colour scene images scaled to four different sizes while their eye movements were recorded. Mean and median saccade amplitudes were directly proportional to image size, while the mode of the distribution lay in the range of very short saccades. However, saccade amplitudes expressed as percentages of image size were not constant across image sizes: relative to image size, saccades were larger on smaller stimulus images and smaller on larger ones. In sum, as far as mean and median saccade amplitudes are concerned, the size of the stimulus images is the dominant factor; other factors, such as image properties, viewing task, or measurement equipment, are of subordinate importance. Thus, the role of stimulus size has to be reconsidered, in theoretical as well as methodological terms.
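A minimal sketch of the reported relationship, using hypothetical numbers rather than the study's data: fit a line to mean saccade amplitude versus image size, then check that amplitude expressed as a percentage of image size decreases as images grow.

```python
# Illustrative only: hypothetical image sizes (deg) and mean saccade amplitudes (deg).
import numpy as np

image_size = np.array([10.0, 20.0, 30.0, 40.0])
mean_amplitude = np.array([2.1, 3.9, 5.8, 7.6])

# Linear fit: amplitude ~ slope * size + intercept
slope, intercept = np.polyfit(image_size, mean_amplitude, 1)
print(f"amplitude = {slope:.2f} * size + {intercept:.2f}")

# Amplitude as a percentage of image size is NOT constant: it is
# relatively larger for small images, as the abstract reports.
print("relative amplitude (%):", np.round(100 * mean_amplitude / image_size, 1))
```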
Abstract:
Previous functional MRI (fMRI) studies have associated the anterior hippocampus with imagining and recalling scenes, imagining the future, recalling autobiographical memories, and visual scene perception. We have observed that this typically involves the medial rather than the lateral portion of the anterior hippocampus. Here, we investigated which specific structures of the hippocampus underpin this observation. We had participants imagine novel scenes during fMRI scanning, as well as recall previously learned scenes from two different time periods (one week and 30 min prior to scanning), with analogous single-object conditions as baselines. Using an extended segmentation protocol focussing on the anterior hippocampus, we first investigated which substructures of the hippocampus respond to scenes, and found both imagination and recall of scenes to be associated with activity in presubiculum/parasubiculum, a region associated with spatial representation in rodents. Next, we compared imagining novel scenes to recall from one week or 30 min before scanning. We expected a strong response to imagining novel scenes and 1-week recall, as both involve constructing scene representations from elements stored across cortex. By contrast, we expected a weaker response to 30-min recall, as representations of these scenes had already been constructed but not yet consolidated. Both imagination and 1-week recall of scenes engaged anterior hippocampal structures (anterior subiculum and uncus, respectively), indicating possible roles in scene construction. By contrast, 30-min recall of scenes elicited significantly less activation of the anterior hippocampus but did engage posterior CA3. Together, these results elucidate the functions of different parts of the anterior hippocampus, a key brain area about which little is definitively known.
Abstract:
The human visual system is adept at detecting and encoding statistical regularities in its spatio-temporal environment. Here we report an unexpected failure of this ability in the context of perceiving inconsistencies in illumination distributions across a scene. Contrary to predictions from previous studies [Enns and Rensink, 1990; Sun and Perona, 1996a, 1996b, 1997], we find that the visual system displays a remarkable lack of sensitivity to illumination inconsistencies, both in experimental stimuli and in images of real scenes. Our results allow us to draw inferences regarding how the visual system encodes illumination distributions across scenes. Specifically, they suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction.
Abstract:
Do we view the world differently if it is described to us in figurative rather than literal terms? An answer to this question would reveal something about both the conceptual representation of figurative language and the scope of top-down influences on scene perception. Previous work has shown that participants will look longer at a path region of a picture when it is described with a type of figurative language called fictive motion (The road goes through the desert) rather than without it (The road is in the desert). The current experiment provided evidence that such fictive motion descriptions affect eye movements by evoking mental representations of motion. If participants heard contextual information that would hinder actual motion, it influenced how they viewed a picture when it was described with fictive motion. Inspection times and eye movements scanning along the path increased during fictive motion descriptions when the terrain was first described as difficult (The desert is hilly) as compared to easy (The desert is flat); there were no such effects for descriptions without fictive motion. It is argued that fictive motion evokes a mental simulation of motion that is immediately integrated with visual processing, and hence figurative language can have a distinct effect on perception.
Abstract:
An increasing number of neuroscience experiments are using virtual reality to provide a more immersive and less artificial experimental environment. This is particularly useful for navigation and three-dimensional scene perception experiments. Such experiments require accurate real-time tracking of the observer's head in order to render the virtual scene. Here, we present data on the accuracy of a commonly used six-degrees-of-freedom tracker (Intersense IS900) when it is moved in ways typical of virtual reality applications. We compared the reported location of the tracker with its location computed by an optical tracking method. When the tracker was stationary, the root mean square error in spatial accuracy was 0.64 mm. However, we found that errors increased more than ten-fold (up to 17 mm) when the tracker moved at speeds common in virtual reality applications. We demonstrate that the errors we report here are predominantly due to inaccuracies of the IS900 system rather than of the optical tracking against which it was compared.
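The accuracy measure here is the root-mean-square (RMS) positional error between the tracker's reported positions and the optical reference. A minimal sketch, assuming paired (N, 3) position samples in millimetres:

```python
import numpy as np

def rms_error(reported: np.ndarray, reference: np.ndarray) -> float:
    """RMS of the Euclidean distance between paired (N, 3) position samples."""
    distances = np.linalg.norm(reported - reference, axis=1)
    return float(np.sqrt(np.mean(distances ** 2)))

# Hypothetical stationary-tracker samples (mm) against an optical ground truth
reported = np.array([[0.5, 0.1, -0.3], [0.4, 0.2, -0.2], [0.6, 0.0, -0.4]])
reference = np.zeros_like(reported)
print(f"RMS error: {rms_error(reported, reference):.2f} mm")
```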
Abstract:
When we actively explore the visual environment, our gaze preferentially selects regions characterized by high contrast and a high density of edges, suggesting that the guidance of eye movements during visual exploration is driven to a significant degree by perceptual characteristics of a scene. Converging findings suggest that the selection of the visual target for the upcoming saccade critically depends on a covert shift of spatial attention. However, it is unclear whether attention selects the location of the next fixation solely on the basis of global scene structure or additionally on local perceptual information. To investigate the role of spatial attention in scene processing, we examined eye fixation patterns of patients with spatial neglect during unconstrained exploration of natural images and compared them with those of healthy and brain-injured control participants. We computed the luminance, colour, contrast, and edge information contained in image patches surrounding each fixation and evaluated whether they differed from randomly selected image patches. At the global level, neglect patients showed the characteristic ipsilesional shift of the distribution of their fixations. At the local level, patients with neglect and control participants fixated image regions in ipsilesional space that were closely similar in local feature content. In contrast, when directing their gaze to contralesional (impaired) space, neglect patients fixated regions of significantly higher local luminance and lower edge content than controls. These results suggest that intact spatial attention is necessary for the active sampling of local feature content during scene perception.
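A hedged sketch of the patch analysis described above: compare mean luminance and edge density in patches around fixations against randomly placed control patches. The patch size and the Sobel-based edge measure are assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy import ndimage

def patch_features(image: np.ndarray, x: int, y: int, half: int = 16):
    """Mean luminance and mean gradient magnitude of a patch centred on (x, y)."""
    patch = image[y - half:y + half, x - half:x + half]
    edges = np.hypot(ndimage.sobel(patch, 0), ndimage.sobel(patch, 1))
    return patch.mean(), edges.mean()

rng = np.random.default_rng(0)
image = rng.random((480, 640))        # stand-in greyscale scene
fixations = [(320, 240), (100, 200)]  # hypothetical (x, y) fixation coordinates

for x, y in fixations:
    lum, edge = patch_features(image, x, y)
    rlum, redge = patch_features(image, int(rng.integers(16, 624)), int(rng.integers(16, 464)))
    print(f"fixated: lum={lum:.2f} edge={edge:.2f} | random: lum={rlum:.2f} edge={redge:.2f}")
```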
Abstract:
In the absence of cues for absolute depth measurement, such as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges, and junctions may provide a 3D model of the scene, but it will not convey the actual "size" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, this is computationally complex due to the difficulty of the object recognition process. Here we propose a source of information for absolute depth estimation that does not rely on specific objects: a procedure for absolute depth estimation based on recognition of the whole scene. The shape of the space of the scene and the structures present in it are strongly related to the scale of observation. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene and therefore its absolute mean depth. We illustrate the usefulness of computing the mean depth of the scene with applications to scene recognition and object detection.
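A conceptual sketch of the idea, not the authors' actual procedure: summarise global image structure into a feature vector (a coarsely binned log power spectrum is used here as an assumed stand-in) and regress the scene's mean depth from it, so that absolute scale follows from recognizing the kind of scene rather than any specific object.

```python
import numpy as np

def global_structure_features(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Coarsely binned log power spectrum as a stand-in global descriptor."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = spectrum.shape
    blocks = spectrum.reshape(bins, h // bins, bins, w // bins).mean(axis=(1, 3))
    return np.log(blocks + 1e-9).ravel()

rng = np.random.default_rng(1)
image = rng.random((256, 256))                      # stand-in greyscale scene
feats = global_structure_features(image)
# Placeholder for weights that would be learned from scenes of known mean depth
weights = rng.normal(scale=0.01, size=feats.size)
print(f"predicted mean depth: {np.exp(feats @ weights):.1f} m (illustrative only)")
```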
Abstract:
Over the last decade, television screens and display monitors have increased considerably in size, but has this improved our televisual experience? Our working hypothesis was that audiences adopt a general strategy that “bigger is better.” However, as our visual perceptions do not tap directly into basic retinal image properties such as retinal image size (C. A. Burbeck, 1987), we wondered whether object size itself might be an important factor. To test this, we needed a task that would tap into the subjective experiences of participants watching a movie on different-sized displays with the same retinal subtense. Our participants used a line bisection task to self-report their level of “presence” (i.e., their involvement with the movie) at several target locations that were probed in a 45-min section of the movie “The Good, The Bad, and The Ugly.” Measures of pupil dilation and reaction time to the probes were also obtained. In Experiment 1, we found that subjective ratings of presence increased with physical screen size, supporting our hypothesis. Face scenes also produced higher presence scores than landscape scenes for both screen sizes. In Experiment 2, reaction time and pupil dilation results showed the same trends as the presence ratings, and pupil dilation correlated with presence ratings, providing some validation of the method. Overall, the results suggest that real-time measures of subjective presence might be a valuable tool for measuring audience experience for different types of (i) display and (ii) audiovisual material.
Abstract:
Studies concerning the processing of natural scenes using eye movement equipment have revealed that observers retain surprisingly little information from one fixation to the next. Other studies, in which fixation remained constant while elements within the scene were changed, have shown that, even without refixation, objects within a scene are surprisingly poorly represented. Although this effect has been studied in some detail in static scenes, there has been relatively little work on scenes as we would normally experience them, namely dynamic and ever-changing. This paper describes a comparable form of change blindness in dynamic scenes, in which detection is performed in the presence of simulated observer motion. The study also describes how change blindness is affected by the manner in which the observer interacts with the environment, comparing the detection performance of observers acting as either the passenger or the driver of a car. The experiments show that observer motion reduces the detection of orientation and location changes, and that the task of driving concentrates object analysis on or near the line of motion, relative to passive viewing of the same scene.
Abstract:
Tone mapping is the problem of compressing the range of a high-dynamic-range image so that it can be displayed on a low-dynamic-range screen without losing details or introducing novel ones: the final image should produce in the observer a sensation as close as possible to the perception produced by the real-world scene. We propose a tone mapping operator with two stages. The first stage is a global method that implements visual adaptation, based on experiments on human perception; in particular, we point out the importance of cone saturation. The second stage performs local contrast enhancement, based on a variational model inspired by color vision phenomenology. We evaluate this method with a metric validated by psychophysical experiments and, in terms of this metric, our method compares very well with the state of the art.
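A toy two-stage sketch in the spirit of the operator described: a Naka-Rushton-style global adaptation stands in for the visual-adaptation stage (it exhibits cone-like saturation, but is not necessarily the authors' formulation), and an unsharp-mask detail boost stands in for the variational local contrast stage.

```python
import numpy as np
from scipy import ndimage

def tone_map(hdr: np.ndarray) -> np.ndarray:
    # Stage 1 (global): Naka-Rushton-style adaptation; the semi-saturation
    # constant is the geometric mean luminance, so responses saturate for
    # luminances far above the adaptation level.
    mean_lum = np.exp(np.mean(np.log(hdr + 1e-6)))
    adapted = hdr / (hdr + mean_lum)
    # Stage 2 (local): amplify detail relative to a smoothed base layer,
    # a crude proxy for local contrast enhancement.
    base = ndimage.gaussian_filter(adapted, sigma=5)
    return np.clip(base + 1.5 * (adapted - base), 0.0, 1.0)

hdr = np.random.default_rng(2).lognormal(sigma=2.0, size=(64, 64))
ldr = tone_map(hdr)
print(f"input range {hdr.max() / hdr.min():.0f}:1 -> output [{ldr.min():.2f}, {ldr.max():.2f}]")
```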
Abstract:
Brightness judgments are a key part of the primate brain's visual analysis of the environment. There is general consensus that the perceived brightness of an image region is based not only on its actual luminance, but also on the photometric structure of its neighborhood. However, it is unclear precisely how a region's context influences its perceived brightness. Recent research has suggested that brightness estimation may be based on a sophisticated analysis of scene layout in terms of transparency, illumination and shadows. This work has called into question the role of low-level mechanisms, such as lateral inhibition, as explanations for brightness phenomena. Here we describe experiments with displays for which low-level and high-level analyses make qualitatively different predictions, and with which we can quantitatively assess the trade-offs between low-level and high-level factors. We find that brightness percepts in these displays are governed by low-level stimulus properties, even when these percepts are inconsistent with higher-level interpretations of scene layout. These results point to the important role of low-level mechanisms in determining brightness percepts.
Abstract:
Humans perceive the content (gist) of a scene very rapidly, within about 40 ms (Castelhano and Henderson, 2008, Journal of Experimental Psychology: Human Perception and Performance 34(3), 660-675). It has also been demonstrated that colours contribute to the perception of the gist of a scene if the colours are diagnostic for the distinction of scenes (Oliva and Schyns, 2000, Cognitive Psychology 41, 176-210). We presented 320 coloured photographs of two diagnostic (mountains and coasts) and two nondiagnostic (cities and rooms) colour-scene categories, 80 per category, in a masking paradigm. The mask consisted of randomly distributed colour patches. SOA was varied between 20 and 80 ms in steps of 20 ms, and subjects had to indicate the gist of the scene (4AFC). A control condition without masking was also included. In line with previous results, we found that the gist of nondiagnostic colour scenes is extracted within 40 ms. However, if colour comes into play, extraction of the scene gist is prolonged by about 20 ms. A possible reason for this outcome might be that nondiagnostic colour scenes are identified by their luminance components, which are processed faster than the colour information that, in turn, mediates the identification of diagnostic colour scenes.
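The trial structure is easy to sketch. The loop below is illustrative only: the observer is a random-choice placeholder, and a real experiment would need hardware-synchronised stimulus timing (e.g. via PsychoPy or Psychtoolbox).

```python
import random

CATEGORIES = ["mountains", "coasts", "cities", "rooms"]  # the 4AFC options
SOAS_MS = [20, 40, 60, 80]                               # scene-to-mask onset asynchronies

def run_trial(scene_category: str, soa_ms: int) -> bool:
    # Present the scene for soa_ms, then the colour-patch mask,
    # then collect a 4AFC response (random stand-in for the observer).
    response = random.choice(CATEGORIES)
    return response == scene_category

trials = [(c, s) for c in CATEGORIES for s in SOAS_MS] * 5
random.shuffle(trials)
hits = sum(run_trial(c, s) for c, s in trials)
print(f"4AFC accuracy: {hits / len(trials):.2f} (chance = 0.25)")
```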
Abstract:
The characteristics of moving sound sources have strong implications for the listener's distance perception and the estimation of velocity. Modifications of typical sound emissions, as are currently occurring due to the trend towards electromobility, have an impact on pedestrians' safety in road traffic. Thus, investigations of the relevant cues for velocity and distance perception of moving sound sources are of interest not only to the psychoacoustic community but also for several applications, such as virtual reality, noise pollution, and the safety aspects of road traffic. This article describes a series of psychoacoustic experiments in this field. Dichotic and diotic stimuli from a set of real-life recordings of a passing passenger car and a motorcycle were presented to test subjects, who were asked to determine the velocity of the object and its minimal distance from the listener. The results of these psychoacoustic experiments show that the estimated velocity is strongly linked to the object's distance. Furthermore, binaural cues were shown to contribute significantly to the perception of velocity. In a further experiment, it was shown that, independently of the type of vehicle, the main parameter for distance determination is the maximum sound pressure level at the listener's position. The article suggests a system architecture for the adequate consideration of moving sound sources in virtual auditory environments. Virtual environments can thus be used to investigate the influence of new vehicle powertrain concepts, and the related sound emissions, on pedestrians' ability to estimate the distance and velocity of moving objects.
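The distance cue reported here can be illustrated directly: for a pass-by, the maximum short-term sound pressure level at the listener indexes the minimum source distance, since level falls by roughly 6 dB per doubling of distance for a point source. The windowing and the synthetic pass-by below are assumptions, not the study's stimuli.

```python
import numpy as np

def max_level_db(signal: np.ndarray, fs: int, window_s: float = 0.125) -> float:
    """Maximum short-term RMS level in dB (re an arbitrary unit reference)."""
    n = int(fs * window_s)
    frames = signal[: len(signal) // n * n].reshape(-1, n)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return float(20 * np.log10(rms.max() + 1e-12))

fs = 16000
t = np.arange(fs * 4) / fs
# Synthetic pass-by: source at 10 m/s, closest approach 5 m, 1/r amplitude law
distance = np.sqrt((10 * (t - 2)) ** 2 + 5.0 ** 2)
signal = np.sin(2 * np.pi * 120 * t) / distance
print(f"max level: {max_level_db(signal, fs):.1f} dB (reached near closest approach)")
```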
Abstract:
Quality assessment is a key factor for stereoscopic 3D video content, as some observers experience visual discomfort when viewing 3D video, especially when positive and negative parallax are combined with fast motion. In this paper, we propose techniques to assess objective quality related to motion and depth maps, which facilitate depth perception analysis. Subjective tests were carried out in order to understand the source of the problem. Motion is an important feature affecting the 3D experience but is also often the cause of visual discomfort. The automatic algorithm we developed aims to quantify the impact on viewer experience when common cases of discomfort occur, such as high-motion sequences, scene changes with abrupt parallax changes, or the complete absence of stereoscopy, with the goal of preventing the viewer from having a bad stereoscopic experience.
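One of the discomfort cases named above, a scene change with an abrupt parallax change, can be sketched as a simple check on per-frame disparity statistics. The threshold and the single-number disparity summary are assumptions, not the paper's algorithm.

```python
import numpy as np

def flag_abrupt_parallax(mean_disparity: np.ndarray, cut_frames: list[int],
                         jump_threshold: float = 1.0) -> list[int]:
    """Return scene-cut frames where the mean disparity jumps sharply."""
    flagged = []
    for f in cut_frames:
        if 0 < f < len(mean_disparity):
            if abs(mean_disparity[f] - mean_disparity[f - 1]) > jump_threshold:
                flagged.append(f)
    return flagged

# Hypothetical per-frame mean disparity (deg) with a detected cut at frame 50
disparity = np.concatenate([np.full(50, 0.3), np.full(50, -1.2)])
print(flag_abrupt_parallax(disparity, cut_frames=[50]))  # -> [50]
```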
Abstract:
The primate visual motion system performs numerous functions essential for survival in a dynamic visual world. Prominent among these functions is the ability to recover and represent the trajectories of objects in a form that facilitates behavioral responses to those movements. The first step toward this goal, which consists of detecting the displacement of retinal image features, has been studied for many years in both psychophysical and neurobiological experiments. Evidence indicates that this step is computationally straightforward and occurs at the earliest cortical stage. The second step involves the selective integration of retinal motion signals according to the object of origin. Realization of this step is computationally demanding, as the solution is formally underconstrained. It must rely, by definition, upon retinal cues that are indicative of the spatial relationships within and between objects in the visual scene. Psychophysical experiments have documented this dependence and suggested mechanisms by which it may be achieved. Neurophysiological experiments have provided evidence for a neural substrate that may underlie this selective motion signal integration. Together, they paint a coherent portrait of the means by which retinal image motion gives rise to our perceptual experience of moving objects.