22 resultados para Binocular stereo


Relevância:

10.00% 10.00%

Publicador:

Resumo:

In low-level vision, the representation of scene properties such as shape, albedo, etc., are very high dimensional as they have to describe complicated structures. The approach proposed here is to let the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related and it is possible to find a functional relationship between them. The scene information can be represented in reference to the image where the functional specifies how to translate the image into the associated scene. We illustrate the use of this representation for encoding shape information. We show how this representation has appealing properties such as locality and slow variation across space and scale. These properties provide a way of improving shape estimates coming from other sources of information like stereo.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (e.g., albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from bandpassed image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) Shape recipes implicitly contain information about lighting and materials and we use them for material segmentation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and unlike automatic systems are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplifies the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces. Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This report explores the relation between image intensity and object shape. It is shown that image intensity is related to surface orientation and that a variation in image intensity is related to surface curvature. Computational methods are developed which use the measured intensity variation across surfaces of smooth objects to determine surface orientation. In general, surface orientation is not determined locally by the intensity value recorded at each image point. Tools are needed to explore the problem of determining surface orientation from image intensity. The notion of gradient space , popularized by Huffman and Mackworth, is used to represent surface orientation. The notion of a reflectance map, originated by Horn, is used to represent the relation between surface orientation image intensity. The image Hessian is defined and used to represent surface curvature. Properties of surface curvature are expressed as constraints on possible surface orientations corresponding to a given image point. Methods are presented which embed assumptions about surface curvature in algorithms for determining surface orientation from the intensities recorded in a single view. If additional images of the same object are obtained by varying the direction of incident illumination, then surface orientation is determined locally by the intensity values recorded at each image point. This fact is exploited in a new technique called photometric stereo. The visual inspection of surface defects in metal castings is considered. Two casting applications are discussed. The first is the precision investment casting of turbine blades and vanes for aircraft jet engines. In this application, grain size is an important process variable. The existing industry standard for estimating the average grain size of metals is implemented and demonstrated on a sample turbine vane. Grain size can be computed form the measurements obtained in an image, once the foreshortening effects of surface curvature are accounted for. The second is the green sand mold casting of shuttle eyes for textile looms. Here, physical constraints inherent to the casting process translate into these constraints, it is necessary to interpret features of intensity as features of object shape. Both applications demonstrate that successful visual inspection requires the ability to interpret observed changes in intensity in the context of surface topography. The theoretical tools developed in this report provide a framework for this interpretation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We investigate the properties of feedforward neural networks trained with Hebbian learning algorithms. A new unsupervised algorithm is proposed which produces statistically uncorrelated outputs. The algorithm causes the weights of the network to converge to the eigenvectors of the input correlation with largest eigenvalues. The algorithm is closely related to the technique of Self-supervised Backpropagation, as well as other algorithms for unsupervised learning. Applications of the algorithm to texture processing, image coding, and stereo depth edge detection are given. We show that the algorithm can lead to the development of filters qualitatively similar to those found in primate visual cortex.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the absence of cues for absolute depth measurements as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges and junctions may provide a 3D model of the scene but it will not inform about the actual "size" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, this is computationally complex due to the difficulty of the object recognition process. Here we propose a source of information for absolute depth estimation that does not rely on specific objects: we introduce a procedure for absolute depth estimation based on the recognition of the whole scene. The shape of the space of the scene and the structures present in the scene are strongly related to the scale of observation. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene, and therefore its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection.