867 resultados para Visual Perception
Resumo:
PURPOSE: To examine the basis of previous findings of an association between indices of driving safety and visual motion sensitivity and to examine whether this association could be explained by low-level changes in visual function. METHODS: 36 visually normal participants (aged 19 – 80 years), completed a battery of standard vision tests including visual acuity, contrast sensitivity and automated visual fields. and two tests of motion perception including sensitivity for movement of a drifting Gabor stimulus, and sensitivity for displacement in a random-dot kinematogram (Dmin). Participants also completed a hazard perception test (HPT) which measured participants’ response times to hazards embedded in video recordings of real world driving which has been shown to be linked to crash risk. RESULTS: Dmin for the random-dot stimulus ranged from -0.88 to -0.12 log minutes of arc, and the minimum drift rate for the Gabor stimulus ranged from 0.01 to 0.35 cycles per second. Both measures of motion sensitivity significantly predicted response times on the HPT. In addition, while the relationship involving the HPT and motion sensitivity for the random-dot kinematogram was partially explained by the other visual function measures, the relationship with sensitivity for detection of the drifting Gabor stimulus remained significant even after controlling for these variables. CONCLUSION: These findings suggest that motion perception plays an important role in the visual perception of driving-relevant hazards independent of other areas of visual function and should be further explored as a predictive test of driving safety. Future research should explore the causes of reduced motion perception in order to develop better interventions to improve road safety.
Resumo:
The integration of separate, yet complimentary, cortical pathways appears to play a role in visual perception and action when intercepting objects. The ventral system is responsible for object recognition and identification, while the dorsal system facilitates continuous regulation of action. This dual-system model implies that empirically manipulating different visual information sources during performance of an interceptive action might lead to the emergence of distinct gaze and movement pattern profiles. To test this idea, we recorded hand kinematics and eye movements of participants as they attempted to catch balls projected from a novel apparatus that synchronised or de-synchronised accompanying video images of a throwing action and ball trajectory. Results revealed that ball catching performance was less successful when patterns of hand movements and gaze behaviours were constrained by the absence of advanced perceptual information from the thrower's actions. Under these task constraints, participants began tracking the ball later, followed less of its trajectory, and adapted their actions by initiating movements later and moving the hand faster. There were no performance differences when the throwing action image and ball speed were synchronised or de-synchronised since hand movements were closely linked to information from ball trajectory. Results are interpreted relative to the two-visual system hypothesis, demonstrating that accurate interception requires integration of advanced visual information from kinematics of the throwing action and from ball flight trajectory.
Resumo:
Successful interaction with the world depends on accurate perception of the timing of external events. Neurons at early stages of the primate visual system represent time-varying stimuli with high precision. However, it is unknown whether this temporal fidelity is maintained in the prefrontal cortex, where changes in neuronal activity generally correlate with changes in perception. One reason to suspect that it is not maintained is that humans experience surprisingly large fluctuations in the perception of time. To investigate the neuronal correlates of time perception, we recorded from neurons in the prefrontal cortex and midbrain of monkeys performing a temporal-discrimination task. Visual time intervals were presented at a timescale relevant to natural behavior (<500 ms). At this brief timescale, neuronal adaptation--time-dependent changes in the size of successive responses--occurs. We found that visual activity fluctuated with timing judgments in the prefrontal cortex but not in comparable midbrain areas. Surprisingly, only response strength, not timing, predicted task performance. Intervals perceived as longer were associated with larger visual responses and shorter intervals with smaller responses, matching the dynamics of adaptation. These results suggest that the magnitude of prefrontal activity may be read out to provide temporal information that contributes to judging the passage of time.
Resumo:
This thesis explores the debate and issues regarding the status of visual ;,iferellces in the optical writings of Rene Descartes, George Berkeley and James 1. Gibson. It gathers arguments from across their works and synthesizes an account of visual depthperception that accurately reflects the larger, metaphysical implications of their philosophical theories. Chapters 1 and 2 address the Cartesian and Berkelean theories of depth-perception, respectively. For Descartes and Berkeley the debate can be put in the following way: How is it possible that we experience objects as appearing outside of us, at various distances, if objects appear inside of us, in the representations of the individual's mind? Thus, the Descartes-Berkeley component of the debate takes place exclusively within a representationalist setting. Representational theories of depthperception are rooted in the scientific discovery that objects project a merely twodimensional patchwork of forms on the retina. I call this the "flat image" problem. This poses the problem of depth in terms of a difference between two- and three-dimensional orders (i.e., a gap to be bridged by one inferential procedure or another). Chapter 3 addresses Gibson's ecological response to the debate. Gibson argues that the perceiver cannot be flattened out into a passive, two-dimensional sensory surface. Perception is possible precisely because the body and the environment already have depth. Accordingly, the problem cannot be reduced to a gap between two- and threedimensional givens, a gap crossed with a projective geometry. The crucial difference is not one of a dimensional degree. Chapter 3 explores this theme and attempts to excavate the empirical and philosophical suppositions that lead Descartes and Berkeley to their respective theories of indirect perception. Gibson argues that the notion of visual inference, which is necessary to substantiate representational theories of indirect perception, is highly problematic. To elucidate this point, the thesis steps into the representationalist tradition, in order to show that problems that arise within it demand a tum toward Gibson's information-based doctrine of ecological specificity (which is to say, the theory of direct perception). Chapter 3 concludes with a careful examination of Gibsonian affordallces as the sole objects of direct perceptual experience. The final section provides an account of affordances that locates the moving, perceiving body at the heart of the experience of depth; an experience which emerges in the dynamical structures that cross the body and the world.
Resumo:
Les stimuli naturels projetés sur nos rétines nous fournissent de l’information visuelle riche. Cette information varie le long de propriétés de « bas niveau » telles que la luminance, le contraste, et les fréquences spatiales. Alors qu’une partie de cette information atteint notre conscience, une autre partie est traitée dans le cerveau sans que nous en soyons conscients. Les propriétés de l’information influençant l’activité cérébrale et le comportement de manière consciente versus non-consciente demeurent toutefois peu connues. Cette question a été examinée dans les deux derniers articles de la présente thèse, en exploitant les techniques psychophysiques développées dans les deux premiers articles. Le premier article présente la boîte à outils SHINE (spectrum, histogram, and intensity normalization and equalization), développée afin de permettre le contrôle des propriétés de bas niveau de l'image dans MATLAB. Le deuxième article décrit et valide la technique dite des bulles fréquentielles, qui a été utilisée tout au long des études de cette thèse pour révéler les fréquences spatiales utilisées dans diverses tâches de perception des visages. Cette technique offre les avantages d’une haute résolution au niveau des fréquences spatiales ainsi que d’un faible biais expérimental. Le troisième et le quatrième article portent sur le traitement des fréquences spatiales en fonction de la conscience. Dans le premier cas, la méthode des bulles fréquentielles a été utilisée avec l'amorçage par répétition masquée dans le but d’identifier les fréquences spatiales corrélées avec les réponses comportementales des observateurs lors de la perception du genre de visages présentés de façon consciente versus non-consciente. Les résultats montrent que les mêmes fréquences spatiales influencent de façon significative les temps de réponse dans les deux conditions de conscience, mais dans des sens opposés. Dans le dernier article, la méthode des bulles fréquentielles a été combinée à des enregistrements intracrâniens et au Continuous Flash Suppression (Tsuchiya & Koch, 2005), dans le but de cartographier les fréquences spatiales qui modulent l'activation de structures spécifiques du cerveau (l'insula et l'amygdale) lors de la perception consciente versus non-consciente des expressions faciales émotionnelles. Dans les deux régions, les résultats montrent que la perception non-consciente s'effectue plus rapidement et s’appuie davantage sur les basses fréquences spatiales que la perception consciente. La contribution de cette thèse est donc double. D’une part, des contributions méthodologiques à la recherche en perception visuelle sont apportées par l'introduction de la boîte à outils SHINE ainsi que de la technique des bulles fréquentielles. D’autre part, des indications sur les « corrélats de la conscience » sont fournies à l’aide de deux approches différentes.
Resumo:
In this paper we present a tutorial introduction to two important senses for biological and robotic systems — inertial and visual perception. We discuss the fundamentals of these two sensing modalities from a biological and an engineering perspective. Digital camera chips and micro-machined accelerometers and gyroscopes are now commodities, and when combined with today's available computing can provide robust estimates of self-motion as well 3D scene structure, without external infrastructure. We discuss the complementarity of these sensors, describe some fundamental approaches to fusing their outputs and survey the field.
Resumo:
This work aims to promote integrity in autonomous perceptual systems, with a focus on outdoor unmanned ground vehicles equipped with a camera and a 2D laser range finder. A method to check for inconsistencies between the data provided by these two heterogeneous sensors is proposed and discussed. First, uncertainties in the estimated transformation between the laser and camera frames are evaluated and propagated up to the projection of the laser points onto the image. Then, for each pair of laser scan-camera image acquired, the information at corners of the laser scan is compared with the content of the image, resulting in a likelihood of correspondence. The result of this process is then used to validate segments of the laser scan that are found to be consistent with the image, while inconsistent segments are rejected. Experimental results illustrate how this technique can improve the reliability of perception in challenging environmental conditions, such as in the presence of airborne dust.
Resumo:
By virtue of its widespread afferent projections, perirhinal cortex is thought to bind polymodal information into abstract object-level representations. Consistent with this proposal, deficits in cross-modal integration have been reported after perirhinal lesions in nonhuman primates. It is therefore surprising that imaging studies of humans have not observed perirhinal activation during visual-tactile object matching. Critically, however, these studies did not differentiate between congruent and incongruent trials. This is important because successful integration can only occur when polymodal information indicates a single object (congruent) rather than different objects (incongruent). We scanned neurologically intact individuals using functional magnetic resonance imaging (fMRI) while they matched shapes. We found higher perirhinal activation bilaterally for cross-modal (visual-tactile) than unimodal (visual-visual or tactile-tactile) matching, but only when visual and tactile attributes were congruent. Our results demonstrate that the human perirhinal cortex is involved in cross-modal, visual-tactile, integration and, thus, indicate a functional homology between human and monkey perirhinal cortices.
Resumo:
To identify and categorize complex stimuli such as familiar objects or speech, the human brain integrates information that is abstracted at multiple levels from its sensory inputs. Using cross-modal priming for spoken words and sounds, this functional magnetic resonance imaging study identified 3 distinct classes of visuoauditory incongruency effects: visuoauditory incongruency effects were selective for 1) spoken words in the left superior temporal sulcus (STS), 2) environmental sounds in the left angular gyrus (AG), and 3) both words and sounds in the lateral and medial prefrontal cortices (IFS/mPFC). From a cognitive perspective, these incongruency effects suggest that prior visual information influences the neural processes underlying speech and sound recognition at multiple levels, with the STS being involved in phonological, AG in semantic, and mPFC/IFS in higher conceptual processing. In terms of neural mechanisms, effective connectivity analyses (dynamic causal modeling) suggest that these incongruency effects may emerge via greater bottom-up effects from early auditory regions to intermediate multisensory integration areas (i.e., STS and AG). This is consistent with a predictive coding perspective on hierarchical Bayesian inference in the cortex where the domain of the prediction error (phonological vs. semantic) determines its regional expression (middle temporal gyrus/STS vs. AG/intraparietal sulcus).
Resumo:
In visual search one tries to find the currently relevant item among other, irrelevant items. In the present study, visual search performance for complex objects (characters, faces, computer icons and words) was investigated, and the contribution of different stimulus properties, such as luminance contrast between characters and background, set size, stimulus size, colour contrast, spatial frequency, and stimulus layout were investigated. Subjects were required to search for a target object among distracter objects in two-dimensional stimulus arrays. The outcome measure was threshold search time, that is, the presentation duration of the stimulus array required by the subject to find the target with a certain probability. It reflects the time used for visual processing separated from the time used for decision making and manual reactions. The duration of stimulus presentation was controlled by an adaptive staircase method. The number and duration of eye fixations, saccade amplitude, and perceptual span, i.e., the number of items that can be processed during a single fixation, were measured. It was found that search performance was correlated with the number of fixations needed to find the target. Search time and the number of fixations increased with increasing stimulus set size. On the other hand, several complex objects could be processed during a single fixation, i.e., within the perceptual span. Search time and the number of fixations depended on object type as well as luminance contrast. The size of the perceptual span was smaller for more complex objects, and decreased with decreasing luminance contrast within object type, especially for very low contrasts. In addition, the size and shape of perceptual span explained the changes in search performance for different stimulus layouts in word search. Perceptual span was scale invariant for a 16-fold range of stimulus sizes, i.e., the number of items processed during a single fixation was independent of retinal stimulus size or viewing distance. It is suggested that saccadic visual search consists of both serial (eye movements) and parallel (processing within perceptual span) components, and that the size of the perceptual span may explain the effectiveness of saccadic search in different stimulus conditions. Further, low-level visual factors, such as the anatomical structure of the retina, peripheral stimulus visibility and resolution requirements for the identification of different object types are proposed to constrain the size of the perceptual span, and thus, limit visual search performance. Similar methods were used in a clinical study to characterise the visual search performance and eye movements of neurological patients with chronic solvent-induced encephalopathy (CSE). In addition, the data about the effects of different stimulus properties on visual search in normal subjects were presented as simple practical guidelines, so that the limits of human visual perception could be taken into account in the design of user interfaces.
Resumo:
The neural basis of visual perception can be understood only when the sequence of cortical activity underlying successful recognition is known. The early steps in this processing chain, from retina to the primary visual cortex, are highly local, and the perception of more complex shapes requires integration of the local information. In Study I of this thesis, the progression from local to global visual analysis was assessed by recording cortical magnetoencephalographic (MEG) responses to arrays of elements that either did or did not form global contours. The results demonstrated two spatially and temporally distinct stages of processing: The first, emerging 70 ms after stimulus onset around the calcarine sulcus, was sensitive to local features only, whereas the second, starting at 130 ms across the occipital and posterior parietal cortices, reflected the global configuration. To explore the links between cortical activity and visual recognition, Studies II III presented subjects with recognition tasks of varying levels of difficulty. The occipito-temporal responses from 150 ms onwards were closely linked to recognition performance, in contrast to the 100-ms mid-occipital responses. The averaged responses increased gradually as a function of recognition performance, and further analysis (Study III) showed the single response strengths to be graded as well. Study IV addressed the attention dependence of the different processing stages: Occipito-temporal responses peaking around 150 ms depended on the content of the visual field (faces vs. houses), whereas the later and more sustained activity was strongly modulated by the observers attention. Hemodynamic responses paralleled the pattern of the more sustained electrophysiological responses. Study V assessed the temporal processing capacity of the human object recognition system. Above sufficient luminance, contrast and size of the object, the processing speed was not limited by such low-level factors. Taken together, these studies demonstrate several distinct stages in the cortical activation sequence underlying the object recognition chain, reflecting the level of feature integration, difficulty of recognition, and direction of attention.
Resumo:
Visual information is difficult to search and interpret when the density of the displayed information is high or the layout is chaotic. Visual information that exhibits such properties is generally referred to as being "cluttered." Clutter should be avoided in information visualizations and interface design in general because it can severely degrade task performance. Although previous studies have identified computable correlates of clutter (such as local feature variance and edge density), understanding of why humans perceive some scenes as being more cluttered than others remains limited. Here, we explore an account of clutter that is inspired by findings from visual perception studies. Specifically, we test the hypothesis that the so-called "crowding" phenomenon is an important constituent of clutter. We constructed an algorithm to predict visual clutter in arbitrary images by estimating the perceptual impairment due to crowding. After verifying that this model can reproduce crowding data we tested whether it can also predict clutter. We found that its predictions correlate well with both subjective clutter assessments and search performance in cluttered scenes. These results suggest that crowding and clutter may indeed be closely related concepts and suggest avenues for further research.
Resumo:
It is important for practical application to design an effective and efficient metric for video quality. The most reliable way is by subjective evaluation. Thus, to design an objective metric by simulating human visual system (HVS) is quite reasonable and available. In this paper, the video quality assessment metric based on visual perception is proposed. Three-dimensional wavelet is utilized to decompose video and then extract features to mimic the multichannel structure of HVS. Spatio-temporal contrast sensitivity function (S-T CSF) is employed to weight coefficient obtained by three-dimensional wavelet to simulate nonlinearity feature of the human eyes. Perceptual threshold is exploited to obtain visual sensitive coefficients after S-T CSF filtered. Visual sensitive coefficients are normalized representation and then visual sensitive errors are calculated between reference and distorted video. Finally, temporal perceptual mechanism is applied to count values of video quality for reducing computational cost. Experimental results prove the proposed method outperforms the most existing methods and is comparable to LHS and PVQM.