2 resultados para 13077-021

em Massachusetts Institute of Technology


Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present a unifying framework in which "object-independent" modes of variation are learned from continuous-time data such as video sequences. These modes of variation can be used as "generators" to produce a manifold of images of a new object from a single example of that object. We develop the framework in the context of a well-known example: analyzing the modes of spatial deformations of a scene under camera movement. Our method learns a close approximation to the standard affine deformations that are expected from the geometry of the situation, and does so in a completely unsupervised (i.e. ignorant of the geometry of the situation) fashion. We stress that it is learning a "parameterization", not just the parameter values, of the data. We then demonstrate how we have used the same framework to derive a novel data-driven model of joint color change in images due to common lighting variations. The model is superior to previous models of color change in describing non-linear color changes due to lighting.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The question of how shape is represented is of central interest to understanding visual processing in cortex. While tuning properties of the cells in early part of the ventral visual stream, thought to be responsible for object recognition in the primate, are comparatively well understood, several different theories have been proposed regarding tuning in higher visual areas, such as V4. We used the model of object recognition in cortex presented by Riesenhuber and Poggio (1999), where more complex shape tuning in higher layers is the result of combining afferent inputs tuned to simpler features, and compared the tuning properties of model units in intermediate layers to those of V4 neurons from the literature. In particular, we investigated the issue of shape representation in visual area V1 and V4 using oriented bars and various types of gratings (polar, hyperbolic, and Cartesian), as used in several physiology experiments. Our computational model was able to reproduce several physiological findings, such as the broadening distribution of the orientation bandwidths and the emergence of a bias toward non-Cartesian stimuli. Interestingly, the simulation results suggest that some V4 neurons receive input from afferents with spatially separated receptive fields, leading to experimentally testable predictions. However, the simulations also show that the stimulus set of Cartesian and non-Cartesian gratings is not sufficiently complex to probe shape tuning in higher areas, necessitating the use of more complex stimulus sets.