14 resultados para image motion analysis

em Massachusetts Institute of Technology


Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present psychophysical experiments that measure the accuracy of perceived 3D structure derived from relative image motion. The experiments are motivated by Ullman's incremental rigidity scheme, which builds up 3D structure incrementally over an extended time. Our main conclusions are: first, the human system derives an accurate model of the relative depths of moving points, even in the presence of noise; second, the accuracy of 3D structure improves with time, eventually reaching a plateau; and third, the 3D structure currently perceived depends on previous 3D models. Through computer simulations, we relate the psychophysical observations to the behavior of Ullman's model.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We address the computational role that the construction of a complete surface representation may play in the recovery of 3--D structure from motion. We present a model that combines a feature--based structure--from- -motion algorithm with smooth surface interpolation. This model can represent multiple surfaces in a given viewing direction, incorporates surface constraints from object boundaries, and groups image features using their 2--D image motion. Computer simulations relate the model's behavior to perceptual observations. In a companion paper, we discuss further perceptual experiments regarding the role of surface reconstruction in the human recovery of 3--D structure from motion.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Rapid judgments about the properties and spatial relations of objects are the crux of visually guided interaction with the world. Vision begins, however, with essentially pointwise representations of the scene, such as arrays of pixels or small edge fragments. For adequate time-performance in recognition, manipulation, navigation, and reasoning, the processes that extract meaningful entities from the pointwise representations must exploit parallelism. This report develops a framework for the fast extraction of scene entities, based on a simple, local model of parallel computation.sAn image chunk is a subset of an image that can act as a unit in the course of spatial analysis. A parallel preprocessing stage constructs a variety of simple chunks uniformly over the visual array. On the basis of these chunks, subsequent serial processes locate relevant scene components and assemble detailed descriptions of them rapidly. This thesis defines image chunks that facilitate the most potentially time-consuming operations of spatial analysis---boundary tracing, area coloring, and the selection of locations at which to apply detailed analysis. Fast parallel processes for computing these chunks from images, and chunk-based formulations of indexing, tracing, and coloring, are presented. These processes have been simulated and evaluated on the lisp machine and the connection machine.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple localized image features such as moments extracted from orthogonal view video silhouettes of human walking motion. A suite of time-integration methods, spanning a range of coarseness of time aggregation and modeling of feature distributions, are applied to these image features to create a suite of gait sequence representations. Despite their simplicity, the resulting feature vectors contain enough information to perform well on human identification and gender classification tasks. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times and under varying lighting environments. Each of the integration methods are investigated for their advantages and disadvantages. An improved gait representation is built based on our experiences with the initial set of gait representations. In addition, we show gender classification results using our gait appearance features, the effect of our heuristic feature selection method, and the significance of individual features.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Image analysis and graphics synthesis can be achieved with learning techniques using directly image examples without physically-based, 3D models. In our technique: -- the mapping from novel images to a vector of "pose" and "expression" parameters can be learned from a small set of example images using a function approximation technique that we call an analysis network; -- the inverse mapping from input "pose" and "expression" parameters to output images can be synthesized from a small set of example images and used to produce new images using a similar synthesis network. The techniques described here have several applications in computer graphics, special effects, interactive multimedia and very low bandwidth teleconferencing.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Several algorithms for optical flow are studied theoretically and experimentally. Differential and matching methods are examined; these two methods have differing domains of application- differential methods are best when displacements in the image are small (<2 pixels) while matching methods work well for moderate displacements but do not handle sub-pixel motions. Both types of optical flow algorithm can use either local or global constraints, such as spatial smoothness. Local matching and differential techniques and global differential techniques will be examined. Most algorithms for optical flow utilize weak assumptions on the local variation of the flow and on the variation of image brightness. Strengthening these assumptions improves the flow computation. The computational consequence of this is a need for larger spatial and temporal support. Global differential approaches can be extended to local (patchwise) differential methods and local differential methods using higher derivatives. Using larger support is valid when constraint on the local shape of the flow are satisfied. We show that a simple constraint on the local shape of the optical flow, that there is slow spatial variation in the image plane, is often satisfied. We show how local differential methods imply the constraints for related methods using higher derivatives. Experiments show the behavior of these optical flow methods on velocity fields which so not obey the assumptions. Implementation of these methods highlights the importance of numerical differentiation. Numerical approximation of derivatives require care, in two respects: first, it is important that the temporal and spatial derivatives be matched, because of the significant scale differences in space and time, and, second, the derivative estimates improve with larger support.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Earlier, we introduced a direct method called fixation for the recovery of shape and motion in the general case. The method uses neither feature correspondence nor optical flow. Instead, it directly employs the spatiotemporal gradients of image brightness. This work reports the experimental results of applying some of our fixation algorithms to a sequence of real images where the motion is a combination of translation and rotation. These results show that parameters such as the fization patch size have crucial effects on the estimation of some motion parameters. Some of the critical issues involved in the implementaion of our autonomous motion vision system are also discussed here. Among those are the criteria for automatic choice of an optimum size for the fixation patch, and an appropriate location for the fixation point which result in good estimates for important motion parameters. Finally, a calibration method is described for identifying the real location of the rotation axis in imaging systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Saliency Network proposed by Shashua and Ullman is a well-known approach to the problem of extracting salient curves from images while performing gap completion. This paper analyzes the Saliency Network. The Saliency Network is attractive for several reasons. First, the network generally prefers long and smooth curves over short or wiggly ones. While computing saliencies, the network also fills in gaps with smooth completions and tolerates noise. Finally, the network is locally connected, and its size is proportional to the size of the image. Nevertheless, our analysis reveals certain weaknesses with the method. In particular, we show cases in which the most salient element does not lie on the perceptually most salient curve. Furthermore, in some cases the saliency measure changes its preferences when curves are scaled uniformly. Also, we show that for certain fragmented curves the measure prefers large gaps over a few small gaps of the same total size. In addition, we analyze the time complexity required by the method. We show that the number of steps required for convergence in serial implementations is quadratic in the size of the network, and in parallel implementations is linear in the size of the network. We discuss problems due to coarse sampling of the range of possible orientations. We show that with proper sampling the complexity of the network becomes cubic in the size of the network. Finally, we consider the possibility of using the Saliency Network for grouping. We show that the Saliency Network recovers the most salient curve efficiently, but it has problems with identifying any salient curve other than the most salient one.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a new method for motion estimation and 3D reconstruction from stereo image sequences obtained by a stereo rig moving through a rigid world. We show that given two stereo pairs one can compute the motion of the stereo rig directly from the image derivatives (spatial and temporal). Correspondences are not required. One can then use the images from both pairs combined to compute a dense depth map. The motion estimates between stereo pairs enable us to combine depth maps from all the pairs in the sequence to form an extended scene reconstruction and we show results from a real image sequence. The motion computation is a linear least squares computation using all the pixels in the image. Areas with little or no contrast are implicitly weighted less so one does not have to explicitly apply a confidence measure.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The image of an object can vary dramatically depending on lighting, specularities/reflections and shadows. It is often advantageous to separate these incidental variations from the intrinsic aspects of an image. Along these lines this paper describes a method for photographing objects behind glass and digitally removing the reflections off the glass leaving the image of the objects behind the glass intact. We describe the details of this method which employs simple optical techniques and independent components analysis (ICA) and show its efficacy with several examples.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure'' model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We formulate and interpret several multi-modal registration methods in the context of a unified statistical and information theoretic framework. A unified interpretation clarifies the implicit assumptions of each method yielding a better understanding of their relative strengths and weaknesses. Additionally, we discuss a generative statistical model from which we derive a novel analysis tool, the "auto-information function", as a means of assessing and exploiting the common spatial dependencies inherent in multi-modal imagery. We analytically derive useful properties of the "auto-information" as well as verify them empirically on multi-modal imagery. Among the useful aspects of the "auto-information function" is that it can be computed from imaging modalities independently and it allows one to decompose the search space of registration problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis examines a complete design framework for a real-time, autonomous system with specialized VLSI hardware for computing 3-D camera motion. In the proposed architecture, the first step is to determine point correspondences between two images. Two processors, a CCD array edge detector and a mixed analog/digital binary block correlator, are proposed for this task. The report is divided into three parts. Part I covers the algorithmic analysis; part II describes the design and test of a 32$\time $32 CCD edge detector fabricated through MOSIS; and part III compares the design of the mixed analog/digital correlator to a fully digital implementation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Robots must plan and execute tasks in the presence of uncertainty. Uncertainty arises from sensing errors, control errors, and uncertainty in the geometry of the environment. The last, which is called model error, has received little previous attention. We present a framework for computing motion strategies that are guaranteed to succeed in the presence of all three kinds of uncertainty. The motion strategies comprise sensor-based gross motions, compliant motions, and simple pushing motions.