9 resultados para Eyes
em Boston University Digital Common
Resumo:
We designed the Eyebrow-Clicker, a camera-based human computer interface system that implements a new form of binary switch. When the user raises his or her eyebrows, the binary switch is activated and a selection command is issued. The Eyebrow-Clicker thus replaces the "click" functionality of a mouse. The system initializes itself by detecting the user's eyes and eyebrows, tracks these features at frame rate, and recovers in the event of errors. The initialization uses the natural blinking of the human eye to select suitable templates for tracking. Once execution has begun, a user therefore never has to restart the program or even touch the computer. In our experiments with human-computer interaction software, the system successfully determined 93% of the time when a user raised his eyebrows.
Resumo:
A human-computer interface (HCI) system designed for use by people with severe disabilities is presented. People that are severely paralyzed or afflicted with diseases such as ALS (Lou Gehrig's disease) or multiple sclerosis are unable to move or control any parts of their bodies except for their eyes. The system presented here detects the user's eye blinks and analyzes the pattern and duration of the blinks, using them to provide input to the computer in the form of a mouse click. After the automatic initialization of the system occurs from the processing of the user's involuntary eye blinks in the first few seconds of use, the eye is tracked in real time using correlation with an online template. If the user's depth changes significantly or rapid head movement occurs, the system is automatically reinitialized. There are no lighting requirements nor offline templates needed for the proper functioning of the system. The system works with inexpensive USB cameras and runs at a frame rate of 30 frames per second. Extensive experiments were conducted to determine both the system's accuracy in classifying voluntary and involuntary blinks, as well as the system's fitness in varying environment conditions, such as alternative camera placements and different lighting conditions. These experiments on eight test subjects yielded an overall detection accuracy of 95.3%.
Resumo:
Under natural viewing conditions, a single depthful percept of the world is consciously seen. When dissimilar images are presented to corresponding regions of the two eyes, binocular rivalyr may occur, during which the brain consciously perceives alternating percepts through time. How do the same brain mechanisms that generate a single depthful percept of the world also cause perceptual bistability, notably binocular rivalry? What properties of brain representations correspond to consciously seen percepts? A laminar cortical model of how cortical areas V1, V2, and V4 generate depthful percepts is developed to explain and quantitatively simulate binocualr rivalry data. The model proposes how mechanisms of cortical developement, perceptual grouping, and figure-ground perception lead to signle and rivalrous percepts. Quantitative model simulations include influences of contrast changes that are synchronized with switches in the dominant eye percept, gamma distribution of dominant phase durations, piecemeal percepts, and coexistence of eye-based and stimulus-based rivalry. The model also quantitatively explains data about multiple brain regions involved in rivalry, effects of object attention on switching between superimposed transparent surfaces, and monocular rivalry. These data explanations are linked to brain mechanisms that assure non-rivalrous conscious percepts. To our knowledge, no existing model can explain all of these phenomena.
Resumo:
A neural model is described of how the brain may autonomously learn a body-centered representation of 3-D target position by combining information about retinal target position, eye position, and head position in real time. Such a body-centered spatial representation enables accurate movement commands to the limbs to be generated despite changes in the spatial relationships between the eyes, head, body, and limbs through time. The model learns a vector representation--otherwise known as a parcellated distributed representation--of target vergence with respect to the two eyes, and of the horizontal and vertical spherical angles of the target with respect to a cyclopean egocenter. Such a vergence-spherical representation has been reported in the caudal midbrain and medulla of the frog, as well as in psychophysical movement studies in humans. A head-centered vergence-spherical representation of foveated target position can be generated by two stages of opponent processing that combine corollary discharges of outflow movement signals to the two eyes. Sums and differences of opponent signals define angular and vergence coordinates, respectively. The head-centered representation interacts with a binocular visual representation of non-foveated target position to learn a visuomotor representation of both foveated and non-foveated target position that is capable of commanding yoked eye movementes. This head-centered vector representation also interacts with representations of neck movement commands to learn a body-centered estimate of target position that is capable of commanding coordinated arm movements. Learning occurs during head movements made while gaze remains fixed on a foveated target. An initial estimate is stored and a VOR-mediated gating signal prevents the stored estimate from being reset during a gaze-maintaining head movement. As the head moves, new estimates arc compared with the stored estimate to compute difference vectors which act as error signals that drive the learning process, as well as control the on-line merging of multimodal information.
Resumo:
Under natural viewing conditions, a single depthful percept of the world is consciously seen. When dissimilar images are presented to corresponding regions of the two eyes, binocular rivalry may occur, during which the brain consciously perceives alternating percepts through time. Perceptual bistability can also occur in response to a single ambiguous figure. These percepts raise basic questions: What brain mechanisms generate a single depthful percept of the world? How do the same mechanisms cause perceptual bistability, notably binocular rivalry? What properties of brain representations correspond to consciously seen percepts? How do the dynamics of the layered circuits of visual cortex generate single and bistable percepts? A laminar cortical model of how cortical areas V1, V2, and V4 generate depthful percepts is developed to explain and quantitatively simulate binocular rivalry data. The model proposes how mechanisms of cortical development, perceptual grouping, and figure-ground perception lead to single and rivalrous percepts.
Resumo:
Our eyes are constantly in motion. Even during visual fixation, small eye movements continually jitter the location of gaze. It is known that visual percepts tend to fade when retinal image motion is eliminated in the laboratory. However, it has long been debated whether, during natural viewing, fixational eye movements have functions in addition to preventing the visual scene from fading. In this study, we analysed the influence in humans of fixational eye movements on the discrimination of gratings masked by noise that has a power spectrum similar to that of natural images. Using a new method of retinal image stabilization18, we selectively eliminated the motion of the retinal image that normally occurs during the intersaccadic intervals of visual fixation. Here we show that fixational eye movements improve discrimination of high spatial frequency stimuli, but not of low spatial frequency stimuli. This improvement originates from the temporal modulations introduced by fixational eye movements in the visual input to the retina, which emphasize the high spatial frequency harmonics of the stimulus. In a natural visual world dominated by low spatial frequencies, fixational eye movements appear to constitute an effective sampling strategy by which the visual system enhances the processing of spatial detail.
Resumo:
When we look at a scene, how do we consciously see surfaces infused with lightness and color at the correct depths? Random Dot Stereograms (RDS) probe how binocular disparity between the two eyes can generate such conscious surface percepts. Dense RDS do so despite the fact that they include multiple false binocular matches. Sparse stereograms do so even across large contrast-free regions with no binocular matches. Stereograms that define occluding and occluded surfaces lead to surface percepts wherein partially occluded textured surfaces are completed behind occluding textured surfaces at a spatial scale much larger than that of the texture elements themselves. Earlier models suggest how the brain detects binocular disparity, but not how RDS generate conscious percepts of 3D surfaces. A neural model predicts how the layered circuits of visual cortex generate these 3D surface percepts using interactions between visual boundary and surface representations that obey complementary computational rules.
Resumo:
This article describes neural network models for adaptive control of arm movement trajectories during visually guided reaching and, more generally, a framework for unsupervised real-time error-based learning. The models clarify how a child, or untrained robot, can learn to reach for objects that it sees. Piaget has provided basic insights with his concept of a circular reaction: As an infant makes internally generated movements of its hand, the eyes automatically follow this motion. A transformation is learned between the visual representation of hand position and the motor representation of hand position. Learning of this transformation eventually enables the child to accurately reach for visually detected targets. Grossberg and Kuperstein have shown how the eye movement system can use visual error signals to correct movement parameters via cerebellar learning. Here it is shown how endogenously generated arm movements lead to adaptive tuning of arm control parameters. These movements also activate the target position representations that are used to learn the visuo-motor transformation that controls visually guided reaching. The AVITE model presented here is an adaptive neural circuit based on the Vector Integration to Endpoint (VITE) model for arm and speech trajectory generation of Bullock and Grossberg. In the VITE model, a Target Position Command (TPC) represents the location of the desired target. The Present Position Command (PPC) encodes the present hand-arm configuration. The Difference Vector (DV) population continuously.computes the difference between the PPC and the TPC. A speed-controlling GO signal multiplies DV output. The PPC integrates the (DV)·(GO) product and generates an outflow command to the arm. Integration at the PPC continues at a rate dependent on GO signal size until the DV reaches zero, at which time the PPC equals the TPC. The AVITE model explains how self-consistent TPC and PPC coordinates are autonomously generated and learned. Learning of AVITE parameters is regulated by activation of a self-regulating Endogenous Random Generator (ERG) of training vectors. Each vector is integrated at the PPC, giving rise to a movement command. The generation of each vector induces a complementary postural phase during which ERG output stops and learning occurs. Then a new vector is generated and the cycle is repeated. This cyclic, biphasic behavior is controlled by a specialized gated dipole circuit. ERG output autonomously stops in such a way that, across trials, a broad sample of workspace target positions is generated. When the ERG shuts off, a modulator gate opens, copying the PPC into the TPC. Learning of a transformation from TPC to PPC occurs using the DV as an error signal that is zeroed due to learning. This learning scheme is called a Vector Associative Map, or VAM. The VAM model is a general-purpose device for autonomous real-time error-based learning and performance of associative maps. The DV stage serves the dual function of reading out new TPCs during performance and reading in new adaptive weights during learning, without a disruption of real-time operation. YAMs thus provide an on-line unsupervised alternative to the off-line properties of supervised error-correction learning algorithms. YAMs and VAM cascades for learning motor-to-motor and spatial-to-motor maps are described. YAM models and Adaptive Resonance Theory (ART) models exhibit complementary matching, learning, and performance properties that together provide a foundation for designing a total sensory-cognitive and cognitive-motor autonomous system.
Resumo:
This article describes further evidence for a new neural network theory of biological motion perception. The theory clarifies why parallel streams Vl --> V2, Vl --> MT, and Vl --> V2 --> MT exist for static form and motion form processing among the areas Vl, V2, and MT of visual cortex. The theory suggests that the static form system (Static BCS) generates emergent boundary segmentations whose outputs are insensitive to direction-ofcontrast and insensitive to direction-of-motion, whereas the motion form system (Motion BCS) generates emergent boundary segmentations whose outputs are insensitive to directionof-contrast but sensitive to direction-of-motion. The theory is used to explain classical and recent data about short-range and long-range apparent motion percepts that have not yet been explained by alternative models. These data include beta motion; split motion; gamma motion and reverse-contrast gamma motion; delta motion; visual inertia; the transition from group motion to element motion in response to a Ternus display as the interstimulus interval (ISI) decreases; group motion in response to a reverse-contrast Ternus display even at short ISIs; speed-up of motion velocity as interflash distance increases or flash duration decreases; dependence of the transition from element motion to group motion on stimulus duration and size; various classical dependencies between flash duration, spatial separation, ISI, and motion threshold known as Korte's Laws; dependence of motion strength on stimulus orientation and spatial frequency; short-range and long-range form-color interactions; and binocular interactions of flashes to different eyes.