7 resultados para human visual masking

em Boston University Digital Common


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for use in estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion form moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20°) for wide-field biological vision. In the present paper, we show that an algorithm based on generalization of the log-polar architecture; termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignmnet of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more correct analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages, using synthetic optical flow maps as well as natural image sequences.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

SyNAPSE program of the Defense Advanced Projects Research Agency (HRL Laboratories LLC, subcontract #801881-BS under DARPA prime contract HR0011-09-C-0001); CELEST, a National Science Foundation Science of Learning Center (SBE-0354378)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A non-linear supervised learning architecture, the Specialized Mapping Architecture (SMA) and its application to articulated body pose reconstruction from single monocular images is described. The architecture is formed by a number of specialized mapping functions, each of them with the purpose of mapping certain portions (connected or not) of the input space, and a feedback matching process. A probabilistic model for the architecture is described along with a mechanism for learning its parameters. The learning problem is approached using a maximum likelihood estimation framework; we present Expectation Maximization (EM) algorithms for two different instances of the likelihood probability. Performance is characterized by estimating human body postures from low level visual features, showing promising results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many people suffer from conditions that lead to deterioration of motor control and makes access to the computer using traditional input devices difficult. In particular, they may loose control of hand movement to the extent that the standard mouse cannot be used as a pointing device. Most current alternatives use markers or specialized hardware to track and translate a user's movement to pointer movement. These approaches may be perceived as intrusive, for example, wearable devices. Camera-based assistive systems that use visual tracking of features on the user's body often require cumbersome manual adjustment. This paper introduces an enhanced computer vision based strategy where features, for example on a user's face, viewed through an inexpensive USB camera, are tracked and translated to pointer movement. The main contributions of this paper are (1) enhancing a video based interface with a mechanism for mapping feature movement to pointer movement, which allows users to navigate to all areas of the screen even with very limited physical movement, and (2) providing a customizable, hierarchical navigation framework for human computer interaction (HCI). This framework provides effective use of the vision-based interface system for accessing multiple applications in an autonomous setting. Experiments with several users show the effectiveness of the mapping strategy and its usage within the application framework as a practical tool for desktop users with disabilities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Previous studies have reported considerable intersubject variability in the three-dimensional geometry of the human primary visual cortex (V1). Here we demonstrate that much of this variability is due to extrinsic geometric features of the cortical folds, and that the intrinsic shape of V1 is similar across individuals. V1 was imaged in ten ex vivo human hemispheres using high-resolution (200 μm) structural magnetic resonance imaging at high field strength (7 T). Manual tracings of the stria of Gennari were used to construct a surface representation, which was computationally flattened into the plane with minimal metric distortion. The instrinsic shape of V1 was determined from the boundary of the planar representation of the stria. An ellipse provided a simple parametric shape model that was a good approximation to the boundary of flattened V1. The aspect ration of the best-fitting ellipse was found to be consistent across subject, with a mean of 1.85 and standard deviation of 0.12. Optimal rigid alignment of size-normalized V1 produced greater overlap than that achieved by previous studies using different registration methods. A shape analysis of published macaque data indicated that the intrinsic shape of macaque V1 is also stereotyped, and similar to the human V1 shape. Previoud measurements of the functional boundary of V1 in human and macaque are in close agreement with these results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture boundary attributes like discontinuities in orientation and texture flow curvature as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. Object boundar output of the model is compared to computer vision algorithms using a set of human segmented photographic images. The model classifies textures and suppresses noise using a multiple scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Topdown modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures. Importance of the surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark studies vary from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An analysis of the reset of visual cortical circuits responsible for the binding or segmentation of visual features into coherent visual forms yields a model that explains properties of visual persistence. The reset mechanisms prevent massive smearing or visual percepts in response to rapidly moving images. The model simulates relationships among psychophysical data showing inverse relations of persistence to flash luminance and duration, greaterr persistence of illusory contours than real contours, a U-shaped temporal function for persistence of illusory contours, a reduction of persistence: due to adaptation with a stimulus of like orientation, an increase or persistence due to adaptation with a stimulus of perpendicular orientation, and an increase of persistence with spatial separation of a masking stimulus. The model suggests that a combination of habituative, opponent, and endstopping mechanisms prevent smearing and limit persistence. Earlier work with the model has analyzed data about boundary formation, texture segregation, shape-from-shading, and figure-ground separation. Thus, several types of data support each model mechanism and new predictions are made.