10 resultados para Motion-based input
em Massachusetts Institute of Technology
Resumo:
This paper presents an image-based rendering system using algebraic relations between different views of an object. The system uses pictures of an object taken from known positions. Given three such images it can generate "virtual'' ones as the object would look from any position near the ones that the two input images were taken from. The extrapolation from the example images can be up to about 60 degrees of rotation. The system is based on the trilinear constraints that bind any three view so fan object. As a side result, we propose two new methods for camera calibration. We developed and used one of them. We implemented the system and tested it on real images of objects and faces. We also show experimentally that even when only two images taken from unknown positions are given, the system can be used to render the object from other view points as long as we have a good estimate of the internal parameters of the camera used and we are able to find good correspondence between the example images. In addition, we present the relation between these algebraic constraints and a factorization method for shape and motion estimation. As a result we propose a method for motion estimation in the special case of orthographic projection.
Resumo:
The flexibility of the robot is the key to its success as a viable aid to production. Flexibility of a robot can be explained in two directions. The first is to increase the physical generality of the robot such that it can be easily reconfigured to handle a wide variety of tasks. The second direction is to increase the ability of the robot to interact with its environment such that tasks can still be successfully completed in the presence of uncertainties. The use of articulated hands are capable of adapting to a wide variety of grasp shapes, hence reducing the need for special tooling. The availability of low mass, high bandwidth points close to the manipulated object also offers significant improvements I the control of fine motions. This thesis provides a framework for using articulated hands to perform local manipulation of objects. N particular, it addresses the issues in effecting compliant motions of objects in Cartesian space. The Stanford/JPL hand is used as an example to illustrate a number of concepts. The examples provide a unified methodology for controlling articulated hands grasping with point contacts. We also present a high-level hand programming system based on the methodologies developed in this thesis. Compliant motion of grasped objects and dexterous manipulations can be easily described in the LISP-based hand programming language.
Resumo:
A new formulation for recovering the structure and motion parameters of a moving patch using both motion and shading information is presented. It is based on a new differential constraint equation (FICE) that links the spatiotemporal gradients of irradiance to the motion and structure parameters and the temporal variations of the surface shading. The FICE separates the contribution to the irradiance spatiotemporal gradients of the gradients due to texture from those due to shading and allows the FICE to be used for textured and textureless surface. The new approach, combining motion and shading information, leads directly to two different contributions: it can compensate for the effects of shading variations in recovering the shape and motion; and it can exploit the shading/illumination effects to recover motion and shape when they cannot be recovered without it. The FICE formulation is also extended to multiple frames.
Resumo:
In many motion-vision scenarios, a camera (mounted on a moving vehicle) takes images of an environment to find the "motion'' and shape. We introduce a direct-method called fixation for solving this motion-vision problem in its general case. Fixation uses neither feature-correspondence nor optical-flow. Instead, spatio-temporal brightness gradients are used directly. In contrast to previous direct methods, fixation does not restrict the motion or the environment. Moreover, fixation method neither requires tracked images as its input nor uses mechanical tracking for obtaining fixated images. The experimental results on real images are presented and the implementation issues and techniques are discussed.
Resumo:
In this text, we present two stereo-based head tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm and normal flow constraint. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. We present experiments which test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm which takes advantage of a novel data structure: the linked voxel space. We present experiments to test the accuracy of our approach on 3D face modelling using real-time stereo images.
Resumo:
We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
Resumo:
The processes underlying the perceptual analysis of visual form are believed to have minimal interaction with those subserving the perception of visual motion (Livingstone and Hubel, 1987; Victor and Conte, 1990). Recent reports of functionally and anatomically segregated parallel streams in the primate visual cortex seem to support this hypothesis (Ungerlieder and Mishkin, 1982; VanEssen and Maunsell, 1983; Shipp and Zeki, 1985; Zeki and Shipp, 1988; De Yoe et al., 1994). Here we present perceptual evidence that is at odds with this view and instead suggests strong symmetric interactions between the form and motion processes. In one direction, we show that the introduction of specific static figural elements, say 'F', in a simple motion sequence biases an observer to perceive a particular motion field, say 'M'. In the reverse direction, the imposition of the same motion field 'M' on the original sequence leads the observer to perceive illusory static figural elements 'F'. A specific implication of these findings concerns the possible existence of (what we call) motion end-stopped units in the primate visual system. Such units might constitute part of a mechanism for signalling subjective occluding contours based on motion-field discontinuities.
Resumo:
Image analysis and graphics synthesis can be achieved with learning techniques using directly image examples without physically-based, 3D models. In our technique: -- the mapping from novel images to a vector of "pose" and "expression" parameters can be learned from a small set of example images using a function approximation technique that we call an analysis network; -- the inverse mapping from input "pose" and "expression" parameters to output images can be synthesized from a small set of example images and used to produce new images using a similar synthesis network. The techniques described here have several applications in computer graphics, special effects, interactive multimedia and very low bandwidth teleconferencing.
Resumo:
We describe the key role played by partial evaluation in the Supercomputing Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputing Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at MIT, and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable.
Resumo:
The visual recognition of complex movements and actions is crucial for communication and survival in many species. Remarkable sensitivity and robustness of biological motion perception have been demonstrated in psychophysical experiments. In recent years, neurons and cortical areas involved in action recognition have been identified in neurophysiological and imaging studies. However, the detailed neural mechanisms that underlie the recognition of such complex movement patterns remain largely unknown. This paper reviews the experimental results and summarizes them in terms of a biologically plausible neural model. The model is based on the key assumption that action recognition is based on learned prototypical patterns and exploits information from the ventral and the dorsal pathway. The model makes specific predictions that motivate new experiments.