18 resultados para 3D motion trajectory

em Massachusetts Institute of Technology


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We investigate the differences --- conceptually and algorithmically --- between affine and projective frameworks for the tasks of visual recognition and reconstruction from perspective views. It is shown that an affine invariant exists between any view and a fixed view chosen as a reference view. This implies that for tasks for which a reference view can be chosen, such as in alignment schemes for visual recognition, projective invariants are not really necessary. We then use the affine invariant to derive new algebraic connections between perspective views. It is shown that three perspective views of an object are connected by certain algebraic functions of image coordinates alone (no structure or camera geometry needs to be involved).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The flexibility of the robot is the key to its success as a viable aid to production. Flexibility of a robot can be explained in two directions. The first is to increase the physical generality of the robot such that it can be easily reconfigured to handle a wide variety of tasks. The second direction is to increase the ability of the robot to interact with its environment such that tasks can still be successfully completed in the presence of uncertainties. The use of articulated hands are capable of adapting to a wide variety of grasp shapes, hence reducing the need for special tooling. The availability of low mass, high bandwidth points close to the manipulated object also offers significant improvements I the control of fine motions. This thesis provides a framework for using articulated hands to perform local manipulation of objects. N particular, it addresses the issues in effecting compliant motions of objects in Cartesian space. The Stanford/JPL hand is used as an example to illustrate a number of concepts. The examples provide a unified methodology for controlling articulated hands grasping with point contacts. We also present a high-level hand programming system based on the methodologies developed in this thesis. Compliant motion of grasped objects and dexterous manipulations can be easily described in the LISP-based hand programming language.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new formulation for recovering the structure and motion parameters of a moving patch using both motion and shading information is presented. It is based on a new differential constraint equation (FICE) that links the spatiotemporal gradients of irradiance to the motion and structure parameters and the temporal variations of the surface shading. The FICE separates the contribution to the irradiance spatiotemporal gradients of the gradients due to texture from those due to shading and allows the FICE to be used for textured and textureless surface. The new approach, combining motion and shading information, leads directly to two different contributions: it can compensate for the effects of shading variations in recovering the shape and motion; and it can exploit the shading/illumination effects to recover motion and shape when they cannot be recovered without it. The FICE formulation is also extended to multiple frames.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In model-based vision, there are a huge number of possible ways to match model features to image features. In addition to model shape constraints, there are important match-independent constraints that can efficiently reduce the search without the combinatorics of matching. I demonstrate two specific modules in the context of a complete recognition system, Reggie. The first is a region-based grouping mechanism to find groups of image features that are likely to come from a single object. The second is an interpretive matching scheme to make explicit hypotheses about occlusion and instabilities in the image features.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In many motion-vision scenarios, a camera (mounted on a moving vehicle) takes images of an environment to find the "motion'' and shape. We introduce a direct-method called fixation for solving this motion-vision problem in its general case. Fixation uses neither feature-correspondence nor optical-flow. Instead, spatio-temporal brightness gradients are used directly. In contrast to previous direct methods, fixation does not restrict the motion or the environment. Moreover, fixation method neither requires tracked images as its input nor uses mechanical tracking for obtaining fixated images. The experimental results on real images are presented and the implementation issues and techniques are discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The report addresses the problem of visual recognition under two sources of variability: geometric and photometric. The geometric deals with the relation between 3D objects and their views under orthographic and perspective projection. The photometric deals with the relation between 3D matte objects and their images under changing illumination conditions. Taken together, an alignment-based method is presented for recognizing objects viewed from arbitrary viewing positions and illuminated by arbitrary settings of light sources.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report presents a set of representations methodologies and tools for the purpose of visualizing, analyzing and designing functional shapes in terms of constraints on motion. The core of the research is an interactive computational environment that provides an explicit visual representation of motion constraints produced by shape interactions, and a series of tools that allow for the manipulation of motion constraints and their underlying shapes for the purpose of design.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The registration of pre-operative volumetric datasets to intra- operative two-dimensional images provides an improved way of verifying patient position and medical instrument loca- tion. In applications from orthopedics to neurosurgery, it has a great value in maintaining up-to-date information about changes due to intervention. We propose a mutual information- based registration algorithm to establish the proper align- ment. For optimization purposes, we compare the perfor- mance of the non-gradient Powell method and two slightly di erent versions of a stochastic gradient ascent strategy: one using a sparsely sampled histogramming approach and the other Parzen windowing to carry out probability density approximation. Our main contribution lies in adopting the stochastic ap- proximation scheme successfully applied in 3D-3D registra- tion problems to the 2D-3D scenario, which obviates the need for the generation of full DRRs at each iteration of pose op- timization. This facilitates a considerable savings in compu- tation expense. We also introduce a new probability density estimator for image intensities via sparse histogramming, de- rive gradient estimates for the density measures required by the maximization procedure and introduce the framework for a multiresolution strategy to the problem. Registration results are presented on uoroscopy and CT datasets of a plastic pelvis and a real skull, and on a high-resolution CT- derived simulated dataset of a real skull, a plastic skull, a plastic pelvis and a plastic lumbar spine segment.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this text, we present two stereo-based head tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm and normal flow constraint. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. We present experiments which test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm which takes advantage of a novel data structure: the linked voxel space. We present experiments to test the accuracy of our approach on 3D face modelling using real-time stereo images.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal in this regard has expounded the use of object centered models, such as representations of the objects' 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast to this is another proposal which suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The 'object model', according to this idea, is simply a collection of the sample views encountered during training. Given that object centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity of constructing and using object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The processes underlying the perceptual analysis of visual form are believed to have minimal interaction with those subserving the perception of visual motion (Livingstone and Hubel, 1987; Victor and Conte, 1990). Recent reports of functionally and anatomically segregated parallel streams in the primate visual cortex seem to support this hypothesis (Ungerlieder and Mishkin, 1982; VanEssen and Maunsell, 1983; Shipp and Zeki, 1985; Zeki and Shipp, 1988; De Yoe et al., 1994). Here we present perceptual evidence that is at odds with this view and instead suggests strong symmetric interactions between the form and motion processes. In one direction, we show that the introduction of specific static figural elements, say 'F', in a simple motion sequence biases an observer to perceive a particular motion field, say 'M'. In the reverse direction, the imposition of the same motion field 'M' on the original sequence leads the observer to perceive illusory static figural elements 'F'. A specific implication of these findings concerns the possible existence of (what we call) motion end-stopped units in the primate visual system. Such units might constitute part of a mechanism for signalling subjective occluding contours based on motion-field discontinuities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The conceptual component of this work is about "reference surfaces'' which are the dual of reference frames often used for shape representation purposes. The theoretical component of this work involves the question of whether one can find a unique (and simple) mapping that aligns two arbitrary perspective views of an opaque textured quadric surface in 3D, given (i) few corresponding points in the two views, or (ii) the outline conic of the surface in one view (only) and few corresponding points in the two views. The practical component of this work is concerned with applying the theoretical results as tools for the task of achieving full correspondence between views of arbitrary objects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many 3D objects in the world around us are strongly constrained. For instance, not only cultural artifacts but also many natural objects are bilaterally symmetric. Thoretical arguments suggest and psychophysical experiments confirm that humans may be better in the recognition of symmetric objects. The hypothesis of symmetry-induced virtual views together with a network model that successfully accounts for human recognition of generic 3D objects leads to predictions that we have verified with psychophysical experiments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In order to estimate the motion of an object, the visual system needs to combine multiple local measurements, each of which carries some degree of ambiguity. We present a model of motion perception whereby measurements from different image regions are combined according to a Bayesian estimator --- the estimated motion maximizes the posterior probability assuming a prior favoring slow and smooth velocities. In reviewing a large number of previously published phenomena we find that the Bayesian estimator predicts a wide range of psychophysical results. This suggests that the seemingly complex set of illusions arise from a single computational strategy that is optimal under reasonable assumptions.