957 results for Visual cue integration
Abstract:
While searching for objects, we combine information from multiple visual modalities. Classical theories of visual search assume that features are processed independently prior to an integration stage. Based on this, one would predict that features that are equally discriminable in single-feature search should remain so in conjunction search. We test this hypothesis by examining whether search accuracy in feature search predicts accuracy in conjunction search. Subjects searched for objects combining color with orientation or size; eye movements were recorded. Prior to the main experiment, we matched feature discriminability, ensuring that, in feature search, 70% of saccades were likely to go to the correct target stimulus. In contrast to this symmetric single-feature discrimination performance, the conjunction search task showed an asymmetry in feature discrimination performance: in conjunction search, a similar percentage of saccades went to the correct color as in feature search, but saccades went much less often to the correct orientation or size. Accuracy in feature search is therefore a good predictor of accuracy in conjunction search for color, but not for size and orientation. We propose two explanations for the presence of such asymmetries in conjunction search: the use of conjunctively tuned channels and differential crowding effects for different features.
Abstract:
There is extensive agreement that attention may play a role in spatial stimulus coding (Lu & Proctor, 1995). Several authors have investigated the effects of spatial attention on spatial coding using a spatial cueing procedure combined with a spatial Stroop task, finding that Stroop effects were modulated by spatial cueing. Three hypotheses, the attentional shift account, the referential coding account, and the event integration account, have been used to explain this modulation of spatial Stroop effects by spatial cueing. In these previous studies, on validly cued trials the cue and target appeared not only at the same location but also within the same object, so that both the location and the object were cued. Consequently, the modulation of spatial Stroop effects by spatial cueing was confounded with the role of object-based attention. In the third chapter of this dissertation, using a modified version of the double-rectangle cueing procedure developed by Egly, Driver, and Rafal (1994) together with the spatial Stroop task employed by Lupiáñez and Funes (2005), the separate effects of spatial attention and object-based attention on the location code of visual stimuli were investigated. Across four experiments, the combined results showed that spatial Stroop effects were modulated by object-based attention, but not by location-based attention. This pattern of results is well explained by the event integration account, but not by the attentional shift or referential coding accounts. In the fourth chapter, building on the previous chapter, it was investigated whether the modulation of the location code by attentional cueing occurred at the stage of perceptual identification or of response selection. The findings were that object-based attention modulated spatial Stroop effects but not Simon effects, whereas spatial attention modulated neither Stroop nor Simon effects. This pattern of results partially replicated the outcome of the previous chapter. Previous studies have generally argued that the conflicts in the spatial Stroop task and the Simon task arise at the stages of perceptual identification and response selection, respectively. It can therefore be concluded that the modulation of the spatial Stroop effect by attention was mediated by object-based attention and occurred at the stage of perceptual identification. Whereas previous studies mostly examined the effects of attention captured by abrupt onsets on spatial Stroop effects, few have examined attention captured by offset cues. The aim of the fifth chapter was therefore to investigate the role of attention induced by offset and abrupt onset cues in the spatial Stroop task. The results showed that attention elicited by either offset or abrupt onset cues modulated spatial Stroop effects, which is consistent with the event integration account.
Abstract:
A key question regarding primate visual motion perception is whether the motion of 2D patterns is recovered by tracking distinctive localizable features [Lorenceau and Gorea, 1989; Rubin and Hochstein, 1992] or by integrating ambiguous local motion estimates [Adelson and Movshon, 1982; Wilson and Kim, 1992]. For a two-grating plaid pattern, this translates to either tracking the grating intersections or appropriately combining the motion estimates for each grating. Since both component and feature information are simultaneously available in any plaid pattern made of contrast-defined gratings, it is unclear how to determine which of the two schemes is actually used to recover the plaid's motion. To address this problem, we have designed a plaid pattern made with subjective, rather than contrast-defined, gratings. The distinguishing characteristic of such a plaid pattern is that it contains no contrast-defined intersections that may be tracked. We find that, notwithstanding the absence of such features, observers can accurately recover the pattern velocity. Additionally, we show that the hypothesis of tracking "illusory features" to estimate pattern motion does not stand up to experimental test. These results present direct evidence in support of the idea that calls for the integration of component motions over the one that mandates tracking localized features to recover 2D pattern motion. The localized features, we suggest, are used primarily as providers of grouping information: which component motion signals to integrate and which not to.
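The component-integration scheme favoured by these results is usually formalized as the intersection-of-constraints (IOC) rule of Adelson and Movshon (1982): each grating's motion constrains the 2D pattern velocity to a line in velocity space, and the pattern velocity is the intersection of the two lines. A minimal sketch, with illustrative grating parameters rather than the study's actual stimuli:

```python
# Minimal sketch of the intersection-of-constraints (IOC) rule for plaid motion
# (Adelson & Movshon, 1982). Each grating's normal speed constrains the 2D
# pattern velocity v to satisfy v . n_i = s_i; the plaid velocity is the
# intersection of the two constraint lines. Grating parameters are illustrative.
import numpy as np

def ioc_velocity(normals, speeds):
    """Solve v . n_i = s_i for the 2D pattern velocity v."""
    N = np.asarray(normals, dtype=float)   # 2x2: one unit normal per row
    s = np.asarray(speeds, dtype=float)    # normal speeds of the two gratings
    return np.linalg.solve(N, s)

# Example: two gratings oriented +/-45 deg, each drifting at 1 deg/s along its normal.
n1 = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])
n2 = np.array([np.cos(-np.pi / 4), np.sin(-np.pi / 4)])
print(ioc_velocity([n1, n2], [1.0, 1.0]))  # -> [1.414..., 0.0]: rightward pattern motion
```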
Abstract:
Recovering a volumetric model of a person, car, or other object of interest from a single snapshot would be useful for many computer graphics applications. 3D model estimation in general is hard, and currently requires active sensors, multiple views, or integration over time. For a known object class, however, 3D shape can be successfully inferred from a single snapshot. We present a method for generating a "virtual visual hull", an estimate of the 3D shape of an object from a known class, given a single silhouette observed from an unknown viewpoint. For a given class, a large database of multi-view silhouette examples from calibrated, though possibly varied, camera rigs is collected. To infer the virtual visual hull of a novel single-view input silhouette, we search for 3D shapes in the database which are most consistent with the observed contour. The input is matched to component single views of the multi-view training examples. A set of viewpoint-aligned virtual views is generated from the visual hulls corresponding to these examples. The 3D shape estimate for the input is then found by interpolating between the contours of these aligned views. When the underlying shape is ambiguous given a single-view silhouette, we produce multiple visual hull hypotheses; if a sequence of input images is available, a dynamic programming approach is applied to find the maximum likelihood path through the feasible hypotheses over time. We show results of our algorithm on real and synthetic images of people.
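The first stage of this pipeline, matching the input silhouette against stored single views, can be sketched as a nearest-neighbour retrieval; the toy data and symmetric-difference distance below are illustrative stand-ins, not the authors' implementation, and the visual hulls of the retrieved examples would then supply the viewpoint-aligned virtual views for contour interpolation.

```python
# Minimal, self-contained sketch of the retrieval step: a novel silhouette is
# matched to stored single-view silhouettes; the best matches' visual hulls
# would then be used to generate aligned virtual views. Toy data only.
import numpy as np

def silhouette_distance(a, b):
    """Symmetric difference of two binary silhouette masks (toy measure)."""
    return np.logical_xor(a, b).sum()

def retrieve_matches(query, database, k=3):
    """Return indices of the k stored silhouettes most similar to the query."""
    dists = [silhouette_distance(query, ex) for ex in database]
    return np.argsort(dists)[:k]

# Toy example: 8x8 binary silhouettes.
rng = np.random.default_rng(0)
database = [rng.integers(0, 2, size=(8, 8)).astype(bool) for _ in range(20)]
query = database[5] ^ (rng.random((8, 8)) < 0.05)   # a slightly perturbed copy
print(retrieve_matches(query, database))             # index 5 should rank first
```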
Abstract:
The goal of this work is to navigate through an office environment using only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long-term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.
Abstract:
Li, Longzhuang, Liu, Yonghuai, Obregon, A., Weatherston, M. Visual Segmentation-Based Data Record Extraction From Web Documents. Proceedings of IEEE International Conference on Information Reuse and Integration, 2007, pp. 502-507. Sponsorship: IEEE
Abstract:
A neural theory is proposed in which visual search is accomplished by perceptual grouping and segregation, which occur simultaneously across the visual field, and object recognition, which is restricted to a selected region of the field. The theory offers an alternative hypothesis to recently developed variations on Feature Integration Theory (Treisman and Sato, 1991) and the Guided Search model (Wolfe, Cave, and Franzel, 1989). A neural architecture and search algorithm are specified that quantitatively explain a wide range of psychophysical search data (Wolfe, Cave, and Franzel, 1989; Cohen and Ivry, 1991; Mordkoff, Yantis, and Egeth, 1990; Treisman and Sato, 1991).
Abstract:
Visual search data are given a unified quantitative explanation by a model of how spatial maps in the parietal cortex and object recognition categories in the inferotemporal cortex deploy attentional resources as they reciprocally interact with visual representations in the prestriate cortex. The model's visual representations are organized into multiple boundary and surface representations. Visual search in the model is initiated by organizing multiple items that lie within a given boundary or surface representation into a candidate search grouping. These items are compared with object recognition categories to test for matches or mismatches. Mismatches can trigger deeper searches and recursive selection of new groupings until a target object is identified. This search model is algorithmically specified to quantitatively simulate search data using a single set of parameters, as well as to qualitatively explain a still larger database, including data of Aks and Enns (1992), Bravo and Blake (1990), Chelazzi, Miller, Duncan, and Desimone (1993), Egeth, Virzi, and Garbart (1984), Cohen and Ivry (1991), Enns and Rensink (1990), He and Nakayama (1992), Humphreys, Quinlan, and Riddoch (1989), Mordkoff, Yantis, and Egeth (1990), Nakayama and Silverman (1986), Treisman and Gelade (1980), Treisman and Sato (1990), Wolfe, Cave, and Franzel (1989), and Wolfe and Friedman-Hill (1992). The model hereby provides an alternative to recent variations on the Feature Integration and Guided Search models, and grounds the analysis of visual search in neural models of preattentive vision, attentive object learning and categorization, and attentive spatial localization and orientation.
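The grouping-then-matching search loop described above can be caricatured in a few lines; the items, features, and match test below are toy stand-ins and do not reproduce the published model's neural dynamics.

```python
# Toy caricature of recursive grouping-and-matching search: items are grouped
# by a shared feature, a candidate grouping is tested against the target, and
# a mismatch at the finer grain triggers a deeper search on a new grouping.
def search(items, target_feature, group_key):
    groups = {}
    for item in items:
        groups.setdefault(item[group_key], []).append(item)
    for value, members in groups.items():
        if value == target_feature[group_key]:          # candidate grouping matches
            if len(members) == 1:
                return members[0]                        # target identified
            # recurse on the remaining feature dimension
            other_key = [k for k in target_feature if k != group_key][0]
            return search(members, target_feature, other_key)
    return None

items = [{"color": "red", "shape": "O"}, {"color": "green", "shape": "X"},
         {"color": "red", "shape": "X"}]
print(search(items, {"color": "red", "shape": "X"}, "color"))
```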
Abstract:
Recent memories are generally recalled from a first-person perspective whereas older memories are often recalled from a third-person perspective. We investigated how repeated retrieval affects the availability of visual information, and whether it could explain the observed shift in perspective with time. In Experiment 1, participants performed mini-events and nominated memories of recent autobiographical events in response to cue words. Next, they described their memory for each event and rated its phenomenological characteristics. Over the following three weeks, they repeatedly retrieved half of the mini-event and cue-word memories. No instructions were given about how to retrieve the memories. In Experiment 2, participants were asked to adopt either a first- or third-person perspective during retrieval. One month later, participants retrieved all of the memories and again provided phenomenology ratings. When first-person visual details from the event were repeatedly retrieved, this information was retained better and the shift in perspective was slowed.
Abstract:
Previous studies have attempted to identify sources of contextual information which can facilitate dual adaptation to two variants of a novel environment that are normally prone to interference. The types of contextual information previously used can be grouped into two broad categories: cues that are arbitrary to the motor system, such as a colour cue, and cues based on an internal property of the motor system, such as a change in movement effector. The experiments reported here examined whether associating visuomotor rotations with visual targets and movements of different amplitude would serve as an appropriate source of contextual information to enable dual adaptation. The results indicated that visual target and movement amplitude is not a suitable source of contextual information to enable dual adaptation in our task. Interference was observed in groups who were exposed to opposing visuomotor rotations, or to a visuomotor rotation and no rotation, both when the onset of the visuomotor rotations was sudden and when it occurred gradually over the course of training. Furthermore, the pattern of interference indicated that the inability to dual adapt was a result of the generalisation of learning between the two visuomotor mappings associated with each of the visual target and movement amplitudes.
Abstract:
Rapid orienting movements of the eyes are believed to be controlled ballistically. The mechanism underlying this control is thought to involve a comparison between the desired displacement of the eye and an estimate of its actual position (obtained from the integration of the eye velocity signal). This study shows, however, that under certain circumstances fast gaze movements may be controlled quite differently and may involve mechanisms which use visual information to guide movements prospectively. Subjects were required to make large gaze shifts in yaw towards a target whose location and motion were unknown prior to movement onset. Six of those tested demonstrated remarkable accuracy when making gaze shifts towards a target that appeared during their ongoing movement. In fact, their level of accuracy was not significantly different from that shown when they performed a 'remembered' gaze shift to a known stationary target (F(3,15) = 0.15, p > 0.05). The lack of a stereotypical relationship between the skew of the gaze velocity profile and movement duration indicates that on-line modifications were being made. It is suggested that a fast route from the retina to the superior colliculus could account for this behaviour and that models of oculomotor control need to be updated.
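The ballistic mechanism the abstract contrasts with visually guided control, nulling the difference between the desired displacement and the integral of the eye velocity signal, can be sketched as a simple local feedback loop; the gain and time step below are illustrative values only.

```python
# Sketch of the classical internal-feedback account: the velocity command is
# driven by the motor error between the desired displacement and the integral
# of the velocity signal, and the movement ends when that error is nulled.
desired = 30.0      # desired gaze displacement (deg)
position = 0.0      # integrated eye displacement (deg)
gain, dt = 20.0, 0.001

t = 0.0
while desired - position > 0.1:              # loop until motor error is nulled
    velocity = gain * (desired - position)   # command driven by motor error
    position += velocity * dt                # integration of the velocity signal
    t += dt
print(f"movement ends after {t * 1000:.0f} ms at {position:.1f} deg")
```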
Abstract:
A rapidly increasing number of Web databases are now accessible via their HTML form-based query interfaces. Query result pages are dynamically generated in response to user queries; they encode structured data and are displayed for human use. Query result pages usually contain other types of information in addition to the query results, e.g., advertisements, navigation bars, etc. The problem of extracting structured data from query result pages is critical for Web data integration applications, such as comparison shopping and meta-search engines, and has been intensively studied. A number of approaches have been proposed. As the structures of Web pages become more and more complex, the existing approaches start to fail, and most of them do not remove irrelevant content, which may affect the accuracy of data record extraction. We propose an automated approach for Web data extraction. First, it makes use of visual features and query terms to identify data sections and extracts data records in these sections. We also represent several content and visual features of visual blocks in a data section, and use them to filter out noisy blocks. Second, it measures similarity between data items in different data records based on their visual and content features, and aligns them into different groups so that the data in the same group have the same semantics. The results of our experiments with a large set of Web query result pages in different domains show that our proposed approaches are highly effective.
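The alignment step, grouping data items across records by shared visual and content features so that each group carries one semantic role, can be illustrated with a toy example; the features, threshold, and greedy grouping rule below are invented for illustration and are not the paper's algorithm.

```python
# Toy illustration of aligning data items from different records by the
# similarity of simple visual and content features.
def similarity(a, b):
    """Score two data items on shared visual (tag, font) and content cues."""
    score = 0.0
    score += 1.0 if a["tag"] == b["tag"] else 0.0
    score += 1.0 if a["font_size"] == b["font_size"] else 0.0
    score += 1.0 if a["is_numeric"] == b["is_numeric"] else 0.0
    return score / 3.0

def align(records, threshold=0.66):
    """Greedily assign each item to the first group whose seed it resembles."""
    groups = []
    for record in records:
        for item in record:
            for group in groups:
                if similarity(item, group[0]) >= threshold:
                    group.append(item)
                    break
            else:
                groups.append([item])
    return groups

records = [
    [{"text": "Canon EOS", "tag": "a", "font_size": 14, "is_numeric": False},
     {"text": "$499", "tag": "span", "font_size": 12, "is_numeric": True}],
    [{"text": "Nikon D40", "tag": "a", "font_size": 14, "is_numeric": False},
     {"text": "$450", "tag": "span", "font_size": 12, "is_numeric": True}],
]
for g in align(records):
    print([item["text"] for item in g])   # product names in one group, prices in another
```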
Abstract:
This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to either or both of the video and audio streams using a variety of noise types (e.g., MPEG-4 video compression) and levels. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach, and also compared to any fixed-weight integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams, and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
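A generic illustration of frame-wise dynamic stream weighting helps make the idea concrete: candidate weights are scanned for each frame and the one yielding the most confident class posterior is kept, in the spirit of the maximum weighted stream posterior criterion. The numbers and the exact selection rule below are simplifications, not the published MWSP algorithm.

```python
# Simplified illustration of frame-wise dynamic audio-visual stream weighting:
# for each frame, candidate weights are scanned and the one giving the largest
# posterior for the winning class is kept. Illustrative only.
import numpy as np

def combine(audio_loglik, video_loglik, weights=np.linspace(0.0, 1.0, 11)):
    """Return the class scores under the best stream weight for this frame."""
    best_post, best_scores, best_w = -1.0, None, None
    for w in weights:
        scores = w * audio_loglik + (1.0 - w) * video_loglik   # weighted log-likelihoods
        post = np.exp(scores - scores.max())
        post /= post.sum()                                      # class posteriors
        if post.max() > best_post:
            best_post, best_scores, best_w = post.max(), scores, w
    return best_scores, best_w

# Frame where the audio stream is noisy (flat likelihoods) but video is informative.
audio = np.array([-4.0, -4.1, -3.9])
video = np.array([-2.0, -6.0, -7.0])
scores, w = combine(audio, video)
print(f"chosen audio weight = {w:.1f}, winning class = {int(np.argmax(scores))}")
```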
Abstract:
Children born very preterm, even when intelligence is broadly normal, often experience selective difficulties in executive function and visual-spatial processing. Development of structural cortical connectivity is known to be altered in this group, and functional magnetic resonance imaging (fMRI) evidence indicates that very preterm children recruit different patterns of functional connectivity between cortical regions during cognition. Synchronization of neural oscillations across brain areas has been proposed as a mechanism for dynamically assigning functional coupling to support perceptual and cognitive processing, but little is known about what role oscillatory synchronization may play in the altered neurocognitive development of very preterm children. To investigate this, we recorded magnetoencephalographic (MEG) activity while 7- to 8-year-old children born very preterm and age-matched full-term controls performed a visual short-term memory task. Very preterm children exhibited reduced long-range synchronization in the alpha band during visual short-term memory retention, indicating that cortical alpha rhythms may play a critical role in the altered patterns of functional connectivity expressed by this population during cognitive and perceptual processing. Long-range alpha-band synchronization was also correlated with task performance and visual-perceptual ability within the very preterm group, indicating that altered alpha oscillatory mechanisms mediating transient functional integration between cortical regions may be relevant to selective problems in the neurocognitive development of this vulnerable population at school age.
Abstract:
Local alpha-band synchronization has been associated with both cortical idling and active inhibition. Recent evidence, however, suggests that long-range alpha synchronization increases functional coupling between cortical regions. We demonstrate increased long-range alpha and beta band phase synchronization during short-term memory retention in children 6-10 years of age. Furthermore, whereas alpha-band synchronization between posterior cortex and other regions is increased during retention, local alpha-band synchronization over posterior cortex is reduced. This constitutes a functional dissociation for alpha synchronization across local and long-range cortical scales. We interpret long-range synchronization as reflecting functional integration within a network of frontal and visual cortical regions. Local desynchronization of alpha rhythms over posterior cortex, conversely, likely arises because of increased engagement of visual cortex during retention.
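Long-range phase synchronization of the kind reported in these two studies is commonly quantified with a phase-locking value (PLV) computed from band-limited signals; the sketch below uses the Hilbert transform on simulated 10 Hz signals and is a generic measure, not necessarily the exact analysis used in these studies.

```python
# Minimal sketch of quantifying phase synchronization between two signals with
# a phase-locking value (PLV): 1 = perfectly locked phases, ~0 = unrelated.
# The simulated "alpha" signals are illustrative only.
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    phase_diff = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * phase_diff)))

fs = 250                                              # sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)                           # 2 s of samples
alpha = np.sin(2 * np.pi * 10 * t)                    # 10 Hz "alpha" rhythm
coupled = np.sin(2 * np.pi * 10 * t + 0.5)            # same rhythm, fixed phase lag
noise = np.random.default_rng(1).standard_normal(t.size)

print(phase_locking_value(alpha, coupled))            # close to 1
print(phase_locking_value(alpha, noise))              # much smaller
```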