14 resultados para Rigid
em Massachusetts Institute of Technology
Resumo:
We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves - which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on: non-cartisean networks; a definition of curved axis of inertia; and a ridge detector. I present evidence against frame alignment in human perception. This suggests: frame curves have a role in figure/ground segregation and in fuzzy boundaries; their outside/near/top/ incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision), and proceeds by processing convex structures.
Resumo:
The registration of pre-operative volumetric datasets to intra- operative two-dimensional images provides an improved way of verifying patient position and medical instrument loca- tion. In applications from orthopedics to neurosurgery, it has a great value in maintaining up-to-date information about changes due to intervention. We propose a mutual information- based registration algorithm to establish the proper align- ment. For optimization purposes, we compare the perfor- mance of the non-gradient Powell method and two slightly di erent versions of a stochastic gradient ascent strategy: one using a sparsely sampled histogramming approach and the other Parzen windowing to carry out probability density approximation. Our main contribution lies in adopting the stochastic ap- proximation scheme successfully applied in 3D-3D registra- tion problems to the 2D-3D scenario, which obviates the need for the generation of full DRRs at each iteration of pose op- timization. This facilitates a considerable savings in compu- tation expense. We also introduce a new probability density estimator for image intensities via sparse histogramming, de- rive gradient estimates for the density measures required by the maximization procedure and introduce the framework for a multiresolution strategy to the problem. Registration results are presented on uoroscopy and CT datasets of a plastic pelvis and a real skull, and on a high-resolution CT- derived simulated dataset of a real skull, a plastic skull, a plastic pelvis and a plastic lumbar spine segment.
Resumo:
In this paper we present an approach to perceptual organization and attention based on Curved Inertia Frames (C.I.F.), a novel definition of "curved axis of inertia'' tolerant to noisy and spurious data. The definition is useful because it can find frames that correspond to large, smooth, convex, symmetric and central parts. It is novel because it is global and can detect curved axes. We discuss briefly the relation to human perception, the recognition of non-rigid objects, shape description, and extensions to finding "features", inside/outside relations, and long- smooth ridges in arbitrary surfaces.
Resumo:
Visual object recognition requires the matching of an image with a set of models stored in memory. In this paper we propose an approach to recognition in which a 3-D object is represented by the linear combination of 2-D images of the object. If M = {M1,...Mk} is the set of pictures representing a given object, and P is the 2-D image of an object to be recognized, then P is considered an instance of M if P = Eki=aiMi for some constants ai. We show that this approach handles correctly rigid 3-D transformations of objects with sharp as well as smooth boundaries, and can also handle non-rigid transformations. The paper is divided into two parts. In the first part we show that the variety of views depicting the same object under different transformations can often be expressed as the linear combinations of a small number of views. In the second part we suggest how this linear combinatino property may be used in the recognition process.
Resumo:
A polynomial time algorithm (pruned correspondence search, PCS) with good average case performance for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3D objects from a single 2D image, is presented. Efficient verification algorithms, based on a linear representation of location constraints, are given for the case of affine transformations among vector spaces and for the case of rigid 2D and 3D transformations with scale. Some preliminary experiments suggest that PCS is a practical algorithm. Its similarity to existing correspondence based algorithms means that a number of existing techniques for speedup can be incorporated into PCS to improve its performance.
Resumo:
The task of shape recovery from a motion sequence requires the establishment of correspondence between image points. The two processes, the matching process and the shape recovery one, are traditionally viewed as independent. Yet, information obtained during the process of shape recovery can be used to guide the matching process. This paper discusses the mutual relationship between the two processes. The paper is divided into two parts. In the first part we review the constraints imposed on the correspondence by rigid transformations and extend them to objects that undergo general affine (non rigid) transformation (including stretch and shear), as well as to rigid objects with smooth surfaces. In all these cases corresponding points lie along epipolar lines, and these lines can be recovered from a small set of corresponding points. In the second part of the paper we discuss the potential use of epipolar lines in the matching process. We present an algorithm that recovers the correspondence from three contour images. The algorithm was implemented and used to construct object models for recognition. In addition we discuss how epipolar lines can be used to solve the aperture problem.
Resumo:
This paper consists of two major parts. First, we present the outline of a simple approach to very-low bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.
Resumo:
We provide a theory of the three-dimensional interpretation of a class of line-drawings called p-images, which are interpreted by the human vision system as parallelepipeds ("boxes"). Despite their simplicity, p-images raise a number of interesting vision questions: *Why are p-images seen as three-dimensional objects? Why not just as flatimages? *What are the dimensions and pose of the perceived objects? *Why are some p-images interpreted as rectangular boxes, while others are seen as skewed, even though there is no obvious distinction between the images? *When p-images are rotated in three dimensions, why are the image-sequences perceived as distorting objects---even though structure-from-motion would predict that rigid objects would be seen? *Why are some three-dimensional parallelepipeds seen as radically different when viewed from different viewpoints? We show that these and related questions can be answered with the help of a single mathematical result and an associated perceptual principle. An interesting special case arises when there are right angles in the p-image. This case represents a singularity in the equations and is mystifying from the vision point of view. It would seem that (at least in this case) the vision system does not follow the ordinary rules of geometry but operates in accordance with other (and as yet unknown) principles.
Resumo:
We describe a new method for motion estimation and 3D reconstruction from stereo image sequences obtained by a stereo rig moving through a rigid world. We show that given two stereo pairs one can compute the motion of the stereo rig directly from the image derivatives (spatial and temporal). Correspondences are not required. One can then use the images from both pairs combined to compute a dense depth map. The motion estimates between stereo pairs enable us to combine depth maps from all the pairs in the sequence to form an extended scene reconstruction and we show results from a real image sequence. The motion computation is a linear least squares computation using all the pixels in the image. Areas with little or no contrast are implicitly weighted less so one does not have to explicitly apply a confidence measure.
Resumo:
Stereopsis and motion parallax are two methods for recovering three dimensional shape. Theoretical analyses of each method show that neither alone can recover rigid 3D shapes correctly unless other information, such as perspective, is included. The solutions for recovering rigid structure from motion have a reflection ambiguity; the depth scale of the stereoscopic solution will not be known unless the fixation distance is specified in units of interpupil separation. (Hence the configuration will appear distorted.) However, the correct configuration and the disposition of a rigid 3D shape can be recovered if stereopsis and motion are integrated, for then a unique solution follows from a set of linear equations. The correct interpretation requires only three points and two stereo views.
Resumo:
Cyclic changes in the shape of a quasi-rigid body on a curved manifold can lead to net translation and/or rotation of the body in the manifold. Presuming space-time is a curved manifold as portrayed by general relativity, translation in space can be accomplished simply by cyclic changes in the shape of a body, without any thrust or external forces.
Resumo:
On October 19-22, 1997 the Second PHANToM Users Group Workshop was held at the MIT Endicott House in Dedham, Massachusetts. Designed as a forum for sharing results and insights, the workshop was attended by more than 60 participants from 7 countries. These proceedings report on workshop presentations in diverse areas including rigid and compliant rendering, tool kits, development environments, techniques for scientific data visualization, multi-modal issues and a programming tutorial.
Resumo:
The goal of this thesis is to apply the computational approach to motor learning, i.e., describe the constraints that enable performance improvement with experience and also the constraints that must be satisfied by a motor learning system, describe what is being computed in order to achieve learning, and why it is being computed. The particular tasks used to assess motor learning are loaded and unloaded free arm movement, and the thesis includes work on rigid body load estimation, arm model estimation, optimal filtering for model parameter estimation, and trajectory learning from practice. Learning algorithms have been developed and implemented in the context of robot arm control. The thesis demonstrates some of the roles of knowledge in learning. Powerful generalizations can be made on the basis of knowledge of system structure, as is demonstrated in the load and arm model estimation algorithms. Improving the performance of parameter estimation algorithms used in learning involves knowledge of the measurement noise characteristics, as is shown in the derivation of optimal filters. Using trajectory errors to correct commands requires knowledge of how command errors are transformed into performance errors, i.e., an accurate model of the dynamics of the controlled system, as is demonstrated in the trajectory learning work. The performance demonstrated by the algorithms developed in this thesis should be compared with algorithms that use less knowledge, such as table based schemes to learn arm dynamics, previous single trajectory learning algorithms, and much of traditional adaptive control.
Resumo:
The motion planning problem is of central importance to the fields of robotics, spatial planning, and automated design. In robotics we are interested in the automatic synthesis of robot motions, given high-level specifications of tasks and geometric models of the robot and obstacles. The Mover's problem is to find a continuous, collision-free path for a moving object through an environment containing obstacles. We present an implemented algorithm for the classical formulation of the three-dimensional Mover's problem: given an arbitrary rigid polyhedral moving object P with three translational and three rotational degrees of freedom, find a continuous, collision-free path taking P from some initial configuration to a desired goal configuration. This thesis describes the first known implementation of a complete algorithm (at a given resolution) for the full six degree of freedom Movers' problem. The algorithm transforms the six degree of freedom planning problem into a point navigation problem in a six-dimensional configuration space (called C-Space). The C-Space obstacles, which characterize the physically unachievable configurations, are directly represented by six-dimensional manifolds whose boundaries are five dimensional C-surfaces. By characterizing these surfaces and their intersections, collision-free paths may be found by the closure of three operators which (i) slide along 5-dimensional intersections of level C-Space obstacles; (ii) slide along 1- to 4-dimensional intersections of level C-surfaces; and (iii) jump between 6 dimensional obstacles. Implementing the point navigation operators requires solving fundamental representational and algorithmic questions: we will derive new structural properties of the C-Space constraints and shoe how to construct and represent C-Surfaces and their intersection manifolds. A definition and new theoretical results are presented for a six-dimensional C-Space extension of the generalized Voronoi diagram, called the C-Voronoi diagram, whose structure we relate to the C-surface intersection manifolds. The representations and algorithms we develop impact many geometric planning problems, and extend to Cartesian manipulators with six degrees of freedom.