10 results for cameras and camera accessories

at Massachusetts Institute of Technology


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene's ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.
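As a rough illustration of the last step, the sketch below shows how a 3x3 homography, once recovered, maps an image point to another view such as an overhead view. The function name `apply_homography` and the matrix `H_EXAMPLE` are hypothetical and are not taken from the paper; the abstract does not give the authors' estimation procedure, so only the mapping itself is shown.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (nested lists, row-major)."""
    x, y = point
    # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third coordinate to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical image-to-overhead homography, for illustration only.
H_EXAMPLE = [
    [1.0, 0.2, 5.0],
    [0.0, 1.5, -3.0],
    [0.0, 0.001, 1.0],
]

if __name__ == "__main__":
    print(apply_homography(H_EXAMPLE, (10.0, 20.0)))
```

The division by the third homogeneous coordinate is what lets a single matrix express the perspective foreshortening of the ground plane.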

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Enhanced reality visualization is the process of enhancing an image by adding to it information which is not present in the original image. A wide variety of information can be added to an image, ranging from hidden lines or surfaces to textual or iconic data about a particular part of the image. Enhanced reality visualization is particularly well suited to neurosurgery. By rendering brain structures which are not visible, at the correct location in an image of a patient's head, the surgeon is essentially provided with X-ray vision. He can visualize the spatial relationship between brain structures before he performs a craniotomy, and during the surgery he can see what's under the next layer before he cuts through. Given a video image of the patient and a three-dimensional model of the patient's brain, the problem enhanced reality visualization faces is to render the model from the correct viewpoint and overlay it on the original image. The relationship between the coordinate frames of the patient, the patient's internal anatomy scans, and the image plane of the camera observing the patient must be established. This problem is closely related to the camera calibration problem. This report presents a new approach to finding this relationship and develops a system for performing enhanced reality visualization in a surgical environment. Immediately prior to surgery, a few circular fiducials are placed near the surgical site. An initial registration of video and internal data is performed using a laser scanner. Following this, our method is fully automatic, runs in nearly real-time, is accurate to within a pixel, allows both patient and camera motion, automatically corrects for changes to the internal camera parameters (focal length, focus, aperture, etc.) and requires only a single image.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper describes a simple method for internal camera calibration for computer vision. This method is based on tracking image features through a sequence of images while the camera undergoes pure rotation. The location of the features relative to the camera or to each other need not be known, and therefore this method can be used both for laboratory calibration and for self-calibration in autonomous robots working in unstructured environments. A second method of calibration is also presented. This method uses simple geometric objects such as spheres and straight lines to find the camera parameters. Calibration is performed using both methods and the results compared.
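One way to see why pure rotation suits internal calibration: under the standard pinhole model, rotating the camera about its optical center moves every pixel by the homography H = K R K^-1, which depends on the intrinsics K and the rotation R but not on scene depth. The sketch below checks that relation numerically. It assumes a simplified intrinsic matrix (one focal length `f`, principal point `(cx, cy)`, no skew or lens distortion); all function names are hypothetical, and this is not the estimation procedure of the paper itself.

```python
import math


def matmul3(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def intrinsics(f, cx, cy):
    """Simplified pinhole intrinsic matrix K (assumed: no skew, square pixels)."""
    return [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]


def intrinsics_inverse(f, cx, cy):
    """Closed-form inverse of the simplified K above."""
    return [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]


def rotation_y(theta):
    """Rotation by theta radians about the camera's y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rotation_homography(f, cx, cy, R):
    """Pixel-to-pixel map induced by a pure camera rotation: K R K^-1."""
    return matmul3(matmul3(intrinsics(f, cx, cy), R),
                   intrinsics_inverse(f, cx, cy))


def project(K, X):
    """Project a 3D point X (in camera coordinates) to pixel coordinates."""
    h = [sum(K[i][k] * X[k] for k in range(3)) for i in range(3)]
    return (h[0] / h[2], h[1] / h[2])


def warp(H, p):
    """Apply a homography H to a pixel p."""
    h = [H[i][0] * p[0] + H[i][1] * p[1] + H[i][2] for i in range(3)]
    return (h[0] / h[2], h[1] / h[2])
```

Because the feature's depth cancels out of H, tracked features constrain K without any knowledge of where they lie in the scene, which matches the abstract's remark that feature locations need not be known.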

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and, unlike automatic systems, are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces.
Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.
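The remark that a 3D position follows from a correspondence "by geometry" can be sketched with midpoint triangulation: each calibrated image contributes a viewing ray, and the point is taken as the midpoint of the shortest segment between the two rays. This is a generic textbook construction shown for illustration, not necessarily the method used in the thesis; the function names and the two-view setup are assumptions.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two viewing rays.

    c1, c2: camera centers; d1, d2: ray directions through the matched pixels.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = tuple(c1[i] - c2[i] for i in range(3))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays: no baseline, so depth cannot be recovered.
        raise ValueError("rays are parallel; triangulation is ill-posed")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(c1[i] + s * d1[i] for i in range(3))
    p2 = tuple(c2[i] + t * d2[i] for i in range(3))
    return tuple(0.5 * (p1[i] + p2[i]) for i in range(3))
```

The denominator shrinks as the two rays become nearly parallel, which is one way to read the abstract's point that long baselines improve accuracy.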

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis examines a complete design framework for a real-time, autonomous system with specialized VLSI hardware for computing 3-D camera motion. In the proposed architecture, the first step is to determine point correspondences between two images. Two processors, a CCD array edge detector and a mixed analog/digital binary block correlator, are proposed for this task. The report is divided into three parts. Part I covers the algorithmic analysis; part II describes the design and test of a 32$\times$32 CCD edge detector fabricated through MOSIS; and part III compares the design of the mixed analog/digital correlator to a fully digital implementation.
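The abstract does not spell out how the binary block correlator works. One plausible software reading, offered only as a sketch, is to match a small block of a binary edge map against displaced blocks in the second image and keep the displacement with the most agreeing pixels. The function names, the agreement-count score, and the square search window below are all assumptions, not details from the thesis.

```python
def block_score(a, b, top_a, left_a, top_b, left_b, size):
    """Count pixels on which two size x size binary blocks agree."""
    return sum(
        1
        for r in range(size)
        for c in range(size)
        if a[top_a + r][left_a + c] == b[top_b + r][left_b + c]
    )


def best_match(a, b, top, left, size, radius):
    """Find the displacement of a's block within b.

    a, b: binary edge maps as lists of rows of 0/1 values.
    Returns (score, d_row, d_col) for the best displacement within
    +/- radius pixels, or None if no candidate block fits inside b.
    """
    rows, cols = len(b), len(b[0])
    best = None
    for d_row in range(-radius, radius + 1):
        for d_col in range(-radius, radius + 1):
            r, c = top + d_row, left + d_col
            # Skip candidate blocks that fall outside the second image.
            if r < 0 or c < 0 or r + size > rows or c + size > cols:
                continue
            score = block_score(a, b, top, left, r, c, size)
            if best is None or score > best[0]:
                best = (score, d_row, d_col)
    return best
```

Comparing single bits keeps each per-pixel operation trivial, which is presumably part of the appeal of a binary correlator for dedicated hardware.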

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The goal of this work is to navigate through an office environment using only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We investigate the differences --- conceptually and algorithmically --- between affine and projective frameworks for the tasks of visual recognition and reconstruction from perspective views. It is shown that an affine invariant exists between any view and a fixed view chosen as a reference view. This implies that for tasks for which a reference view can be chosen, such as in alignment schemes for visual recognition, projective invariants are not really necessary. We then use the affine invariant to derive new algebraic connections between perspective views. It is shown that three perspective views of an object are connected by certain algebraic functions of image coordinates alone (no structure or camera geometry needs to be involved).

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper investigates the linear degeneracies of projective structure estimation from point and line features across three views. We show that the rank of the linear system of equations for recovering the trilinear tensor of three views reduces to 23 (instead of 26) in the case when the scene is a Linear Line Complex (set of lines in space intersecting at a common line) and is 21 when the scene is planar. The LLC situation is only linearly degenerate, and we show that one can obtain a unique solution when the admissibility constraints of the tensor are accounted for. The line configuration described by an LLC, rather than being some obscure case, is in fact quite typical. It includes, as a particular example, the case of a camera moving down a hallway in an office environment or down an urban street. Furthermore, an LLC situation may occur as an artifact such as in direct estimation from spatio-temporal derivatives of image brightness. Therefore, an investigation into degeneracies and their remedy is important also in practice.