16 resultados para V2

em Boston University Digital Common


Relevância:

10.00% 10.00%

Publicador:

Resumo:

This report describes our attempt to add animation as another data type to be used on the World Wide Web. Our current network infrastructure, the Internet, is incapable of carrying video and audio streams for them to be used on the web for presentation purposes. In contrast, object-oriented animation proves to be efficient in terms of network resource requirements. We defined an animation model to support drawing-based and frame-based animation. We also extended the HyperText Markup Language in order to include this animation mode. BU-NCSA Mosanim, a modified version of the NCSA Mosaic for X(v2.5), is available to demonstrate the concept and potentials of animation in presentations an interactive game playing over the web.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A common design of an object recognition system has two steps, a detection step followed by a foreground within-class classification step. For example, consider face detection by a boosted cascade of detectors followed by face ID recognition via one-vs-all (OVA) classifiers. Another example is human detection followed by pose recognition. Although the detection step can be quite fast, the foreground within-class classification process can be slow and becomes a bottleneck. In this work, we formulate a filter-and-refine scheme, where the binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the FRGC V2 data set, hand shape detection and parameter estimation on a hand data set and vehicle detection and view angle estimation on a multi-view vehicle data set. On all data sets, our approach has comparable accuracy and is at least five times faster than the brute force approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A model of laminar visual cortical dynamics proposes how 3D boundary and surface representations of slated and curved 3D objects and 2D images arise. The 3D boundary representations emerge from interactions between non-classical horizontal receptive field interactions with intracorticcal and intercortical feedback circuits. Such non-classical interactions contextually disambiguate classical receptive field responses to ambiguous visual cues using cells that are sensitive to angles and disparity gradients with cortical areas V1 and V2. These cells are all variants of bipole grouping cells. Model simulations show how horizontal connections can develop selectively to angles, how slanted surfaces can activate 3D boundary representations that are sensitive to angles and disparity gradients, how 3D filling-in occurs across slanted surfaces, how a 2D Necker cube image can be represented in 3D, and how bistable Necker cuber percepts occur. The model also explains data about slant aftereffects and 3D neon color spreading. It shows how habituative transmitters that help to control developement also help to trigger bistable 3D percepts and slant aftereffects, and how attention can influence which of these percepts is perceived by propogating along some object boundaries.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

How does the laminar organization of cortical circuitry in areas VI and V2 give rise to 3D percepts of stratification, transparency, and neon color spreading in response to 2D pictures and 3D scenes? Psychophysical experiments have shown that such 3D percepts are sensitive to whether contiguous image regions have the same relative contrast polarity (dark-light or lightdark), yet long-range perceptual grouping is known to pool over opposite contrast polarities. The ocularity of contiguous regions is also critical for neon color spreading: Having different ocularity despite the contrast relationship that favors neon spreading blocks the spread. In addition, half visible points in a stereogram can induce near-depth transparency if the contrast relationship favors transparency in the half visible areas. It thus seems critical to have the whole contrast relationship in a monocular configuration, since splitting it between two stereogram images cancels the effect. What adaptive functions of perceptual grouping enable it to both preserve sensitivity to monocular contrast and also to pool over opposite contrasts? Aspects of cortical development, grouping, attention, perceptual learning, stereopsis and 3D planar surface perception have previously been analyzed using a 3D LAMINART model of cortical areas VI, V2, and V4. The present work consistently extends this model to show how like-polarity competition between VI simple cells in layer 4 may be combined with other LAMINART grouping mechanisms, such as cooperative pooling of opposite polarities at layer 2/3 complex cells. The model also explains how the Metelli Rules can lead to transparent percepts, how bistable transparency percepts can arise in which either surface can be perceived as transparent, and how such a transparency reversal can be facilitated by an attention shift. The like-polarity inhibition prediction is consistent with lateral masking experiments in which two f1anking Gabor patches with the same contrast polarity as the target increase the target detection threshold when they approach the target. It is also consistent with LAMINART simulations of cortical development. Other model explanations and testable predictions will also be presented.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? A 3D FORMOTION model specifies how 3D boundary representations, which separate figures from backgrounds within cortical area V2, capture motion signals at the appropriate depths in MT; how motion signals in MT disambiguate boundaries in V2 via MT-to-Vl-to-V2 feedback; how sparse feature tracking signals are amplified; and how a spatially anisotropic motion grouping process propagates across perceptual space via MT-MST feedback to integrate feature-tracking and ambiguous motion signals to determine a global object motion percept. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A neural model is developed to explain how humans can approach a goal object on foot while steering around obstacles to avoid collisions in a cluttered environment. The model uses optic flow from a 3D virtual reality environment to determine the position of objects based on motion discotinuities, and computes heading direction, or the direction of self-motion, from global optic flow. The cortical representation of heading interacts with the representations of a goal and obstacles such that the goal acts as an attractor of heading, while obstacles act as repellers. In addition the model maintains fixation on the goal object by generating smooth pursuit eye movements. Eye rotations can distort the optic flow field, complicating heading perception, and the model uses extraretinal signals to correct for this distortion and accurately represent heading. The model explains how motion processing mechanisms in cortical areas MT, MST, and VIP can be used to guide steering. The model quantitatively simulates human psychophysical data about visually-guided steering, obstacle avoidance, and route selection.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Under natural viewing conditions, a single depthful percept of the world is consciously seen. When dissimilar images are presented to corresponding regions of the two eyes, binocular rivalyr may occur, during which the brain consciously perceives alternating percepts through time. How do the same brain mechanisms that generate a single depthful percept of the world also cause perceptual bistability, notably binocular rivalry? What properties of brain representations correspond to consciously seen percepts? A laminar cortical model of how cortical areas V1, V2, and V4 generate depthful percepts is developed to explain and quantitatively simulate binocualr rivalry data. The model proposes how mechanisms of cortical developement, perceptual grouping, and figure-ground perception lead to signle and rivalrous percepts. Quantitative model simulations include influences of contrast changes that are synchronized with switches in the dominant eye percept, gamma distribution of dominant phase durations, piecemeal percepts, and coexistence of eye-based and stimulus-based rivalry. The model also quantitatively explains data about multiple brain regions involved in rivalry, effects of object attention on switching between superimposed transparent surfaces, and monocular rivalry. These data explanations are linked to brain mechanisms that assure non-rivalrous conscious percepts. To our knowledge, no existing model can explain all of these phenomena.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

When brain mechanism carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the seperation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feauture tracking signals are amplified before they propogate across position and are intergrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This article develops the Synchronous Matching Adaptive Resonance Theory (SMART) neural model to explain how the brain may coordinate multiple levels of thalamocortical and corticocortical processing to rapidly learn, and stably remember, important information about a changing world. The model clarifies how bottom-up and top-down processes work together to realize this goal, notably how processes of learning, expectation, attention, resonance, and synchrony are coordinated. The model hereby clarifies, for the first time, how the following levels of brain organization coexist to realize cognitive processing properties that regulate fast learning and stable memory of brain representations: single cell properties, such as spiking dynamics, spike-timing-dependent plasticity (STDP), and acetylcholine modulation; detailed laminar thalamic and cortical circuit designs and their interactions; aggregate cell recordings, such as current-source densities and local field potentials; and single cell and large-scale inter-areal oscillations in the gamma and beta frequency domains. In particular, the model predicts how laminar circuits of multiple cortical areas interact with primary and higher-order specific thalamic nuclei and nonspecific thalamic nuclei to carry out attentive visual learning and information processing. The model simulates how synchronization of neuronal spiking occurs within and across brain regions, and triggers STDP. Matches between bottom-up adaptively filtered input patterns and learned top-down expectations cause gamma oscillations that support attention, resonance, and learning. Mismatches inhibit learning while causing beta oscillations during reset and hypothesis testing operations that are initiated in the deeper cortical layers. The generality of learned recognition codes is controlled by a vigilance process mediated by acetylcholine.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A neural model is developed to explain how humans can approach a goal object on foot while steering around obstacles to avoid collisions in a cluttered environment. The model uses optic flow from a 3D virtual reality environment to determine the position of objects based on motion discontinuities, and computes heading direction, or the direction of self-motion, from global optic flow. The cortical representation of heading interacts with the representations of a goal and obstacles such that the goal acts as an attractor of heading, while obstacles act as repellers. In addition the model maintains fixation on the goal object by generating smooth pursuit eye movements. Eye rotations can distort the optic flow field, complicating heading perception, and the model uses extraretinal signals to correct for this distortion and accurately represent heading. The model explains how motion processing mechanisms in cortical areas MT, MST, and posterior parietal cortex can be used to guide steering. The model quantitatively simulates human psychophysical data about visually-guided steering, obstacle avoidance, and route selection.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A neural model is presented of how cortical areas V1, V2, and V4 interact to convert a textured 2D image into a representation of curved 3D shape. Two basic problems are solved to achieve this: (1) Patterns of spatially discrete 2D texture elements are transformed into a spatially smooth surface representation of 3D shape. (2) Changes in the statistical properties of texture elements across space induce the perceived 3D shape of this surface representation. This is achieved in the model through multiple-scale filtering of a 2D image, followed by a cooperative-competitive grouping network that coherently binds texture elements into boundary webs at the appropriate depths using a scale-to-depth map and a subsequent depth competition stage. These boundary webs then gate filling-in of surface lightness signals in order to form a smooth 3D surface percept. The model quantitatively simulates challenging psychophysical data about perception of prolate ellipsoids (Todd and Akerstrom, 1987, J. Exp. Psych., 13, 242). In particular, the model represents a high degree of 3D curvature for a certain class of images, all of whose texture elements have the same degree of optical compression, in accordance with percepts of human observers. Simulations of 3D percepts of an elliptical cylinder, a slanted plane, and a photo of a golf ball are also presented.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? Consider, for example, a deer moving behind a bush. Here the partially occluded fragments of motion signals available to an observer must be coherently grouped into the motion of a single object. A 3D FORMOTION model comprises five important functional interactions involving the brain’s form and motion systems that address such situations. Because the model’s stages are analogous to areas of the primate visual system, we refer to the stages by corresponding anatomical names. In one of these functional interactions, 3D boundary representations, in which figures are separated from their backgrounds, are formed in cortical area V2. These depth-selective V2 boundaries select motion signals at the appropriate depths in MT via V2-to-MT signals. In another, motion signals in MT disambiguate locally incomplete or ambiguous boundary signals in V2 via MT-to-V1-to-V2 feedback. The third functional property concerns resolution of the aperture problem along straight moving contours by propagating the influence of unambiguous motion signals generated at contour terminators or corners. Here, sparse “feature tracking signals” from, e.g., line ends, are amplified to overwhelm numerically superior ambiguous motion signals along line segment interiors. In the fourth, a spatially anisotropic motion grouping process takes place across perceptual space via MT-MST feedback to integrate veridical feature-tracking and ambiguous motion signals to determine a global object motion percept. The fifth property uses the MT-MST feedback loop to convey an attentional priming signal from higher brain areas back to V1 and V2. The model's use of mechanisms such as divisive normalization, endstopping, cross-orientation inhibition, and longrange cooperation is described. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We propose that a simple, closed-form mathematical expression--the Wedge-Dipole mapping--provides a concise approximation to the full-field, two-dimensional topographic structure of macaque V1, V2, and V3. A single map function, which we term a map complex, acts as a simultaneous descriptor of all three areas. Quantitative estimation of the Wedge-Dipole parameters is provided via 2DG data of central-field V1 topography and a publicly available data set of full-field macaque V1 and V2 topography. Good quantitative agreement is obtained between the data and the model presented here. The increasing importance of fMRI-based brain imaging motivates the development of more sophisticated two-dimensional models of cortical visuotopy, in contrast to the one-dimensional approximations that have been in common use. One reason is that topography has traditionally supplied an important aspect of "ground truth", or validation, for brain imaging, suggesting that further development of high-resolution fMRI will be facilitated by this data analysis. In addition, several important insights into the nature of cortical topography follows from this work. The presence of anisotropy in cortical magnification factor is shown to follow mathematically from the shared boundary conditions at the V1-V2 and V2-V3 borders, and therefore may not causally follow from the existence of columnar systems in these areas, as is widely assumed. An application of the Wedge-Dipole model to localizing aspects of visual processing to specific cortical areas--extending previous work in correlating V1 cortical magnification factor to retinal anatomy or visual psychophysics data--is briefly discussed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Under natural viewing conditions, a single depthful percept of the world is consciously seen. When dissimilar images are presented to corresponding regions of the two eyes, binocular rivalry may occur, during which the brain consciously perceives alternating percepts through time. Perceptual bistability can also occur in response to a single ambiguous figure. These percepts raise basic questions: What brain mechanisms generate a single depthful percept of the world? How do the same mechanisms cause perceptual bistability, notably binocular rivalry? What properties of brain representations correspond to consciously seen percepts? How do the dynamics of the layered circuits of visual cortex generate single and bistable percepts? A laminar cortical model of how cortical areas V1, V2, and V4 generate depthful percepts is developed to explain and quantitatively simulate binocular rivalry data. The model proposes how mechanisms of cortical development, perceptual grouping, and figure-ground perception lead to single and rivalrous percepts.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This article describes further evidence for a new neural network theory of biological motion perception that is called a Motion Boundary Contour System. This theory clarifies why parallel streams Vl-> V2 and Vl-> MT exist for static form and motion form processing among the areas Vl, V2, and MT of visual cortex. The Motion Boundary Contour System consists of several parallel copies, such that each copy is activated by a different range of receptive field sizes. Each copy is further subdivided into two hierarchically organized subsystems: a Motion Oriented Contrast Filter, or MOC Filter, for preprocessing moving images; and a Cooperative-Competitive Feedback Loop, or CC Loop, for generating emergent boundary segmentations of the filtered signals. The present article uses the MOC Filter to explain a variety of classical and recent data about short-range and long-range apparent motion percepts that have not yet been explained by alternative models. These data include split motion; reverse-contrast gamma motion; delta motion; visual inertia; group motion in response to a reverse-contrast Ternus display at short interstimulus intervals; speed-up of motion velocity as interfiash distance increases or flash duration decreases; dependence of the transition from element motion to group motion on stimulus duration and size; various classical dependencies between flash duration, spatial separation, interstimulus interval, and motion threshold known as Korte's Laws; and dependence of motion strength on stimulus orientation and spatial frequency. These results supplement earlier explanations by the model of apparent motion data that other models have not explained; a recent proposed solution of the global aperture problem, including explanations of motion capture and induced motion; an explanation of how parallel cortical systems for static form perception and motion form perception may develop, including a demonstration that these parallel systems are variations on a common cortical design; an explanation of why the geometries of static form and motion form differ, in particular why opposite orientations differ by 90°, whereas opposite directions differ by 180°, and why a cortical stream Vl -> V2 -> MT is needed; and a summary of how the main properties of other motion perception models can be assimilated into different parts of the Motion Boundary Contour System design.