5 resultados para Invariants
em Massachusetts Institute of Technology
Resumo:
Structure from motion often refers to the computation of 3D structure from a matched sequence of images. However, a depth map of a surface is difficult to compute and may not be a good representation for storage and recognition. Given matched images, I will first show that the sign of the normal curvature in a given direction at a given point in the image can be computed from a simple difference of slopes of line-segments in one image. Using this result, local surface patches can be classified as convex, concave, parabolic (cylindrical), hyperbolic (saddle point) or planar. At the same time the translational component of the optical flow is obtained, from which the focus of expansion can be computed.
Resumo:
A study is made of the recognition and transformation of figures by iterative arrays of finite state automata. A figure is a finite rectangular two-dimensional array of symbols. The iterative arrays considered are also finite, rectangular, and two-dimensional. The automata comprising any given array are called cells and are assumed to be isomorphic and to operate synchronously with the state of a cell at time t+1 being a function of the states of it and its four nearest neighbors at time t. At time t=0 each cell is placed in one of a fixed number of initial states. The pattern of initial states thus introduced represents the figure to be processed. The resulting sequence of array states represents a computation based on the input figure. If one waits for a specially designated cell to indicate acceptance or rejection of the figure, the array is said to be working on a recognition problem. If one waits for the array to come to a stable configuration representing an output figure, the array is said to be working on a transformation problem.
Resumo:
In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment --- yielding a direct method that cuts through the computations of camera transformation, scene structure and epipolar geometry. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in re-projection tasks.
Resumo:
We discuss a strategy for visual recognition by forming groups of salient image features, and then using these groups to index into a data base to find all of the matching groups of model features. We discuss the most space efficient possible method of representing 3-D models for indexing from 2-D data, and show how to account for sensing error when indexing. We also present a convex grouping method that is robust and efficient, both theoretically and in practice. Finally, we combine these modules into a complete recognition system, and test its performance on many real images.
Resumo:
We investigate the differences --- conceptually and algorithmically --- between affine and projective frameworks for the tasks of visual recognition and reconstruction from perspective views. It is shown that an affine invariant exists between any view and a fixed view chosen as a reference view. This implies that for tasks for which a reference view can be chosen, such as in alignment schemes for visual recognition, projective invariants are not really necessary. We then use the affine invariant to derive new algebraic connections between perspective views. It is shown that three perspective views of an object are connected by certain algebraic functions of image coordinates alone (no structure or camera geometry needs to be involved).