15 resultados para Unresolved vision problem

em Massachusetts Institute of Technology


Relevância:

90.00% 90.00%

Publicador:

Resumo:

In many motion-vision scenarios, a camera (mounted on a moving vehicle) takes images of an environment to find the "motion'' and shape. We introduce a direct-method called fixation for solving this motion-vision problem in its general case. Fixation uses neither feature-correspondence nor optical-flow. Instead, spatio-temporal brightness gradients are used directly. In contrast to previous direct methods, fixation does not restrict the motion or the environment. Moreover, fixation method neither requires tracked images as its input nor uses mechanical tracking for obtaining fixated images. The experimental results on real images are presented and the implementation issues and techniques are discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis takes an interdisciplinary approach to the study of color vision, focussing on the phenomenon of color constancy formulated as a computational problem. The primary contributions of the thesis are (1) the demonstration of a formal framework for lightness algorithms; (2) the derivation of a new lightness algorithm based on regularization theory; (3) the synthesis of an adaptive lightness algorithm using "learning" techniques; (4) the development of an image segmentation algorithm that uses luminance and color information to mark material boundaries; and (5) an experimental investigation into the cues that human observers use to judge the color of the illuminant. Other computational approaches to color are reviewed and some of their links to psychophysics and physiology are explored.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

To use a world model, a mobile robot must be able to determine its own position in the world. To support truly autonomous navigation, I present MARVEL, a system that builds and maintains its own models of world locations and uses these models to recognize its world position from stereo vision input. MARVEL is designed to be robust with respect to input errors and to respond to a gradually changing world by updating its world location models. I present results from real-world tests of the system that demonstrate its reliability. MARVEL fits into a world modeling system under development.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Surface (Lambertain) color is a useful visual cue for analyzing material composition of scenes. This thesis adopts a signal processing approach to color vision. It represents color images as fields of 3D vectors, from which we extract region and boundary information. The first problem we face is one of secondary imaging effects that makes image color different from surface color. We demonstrate a simple but effective polarization based technique that corrects for these effects. We then propose a systematic approach of scalarizing color, that allows us to augment classical image processing tools and concepts for multi-dimensional color signals.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Enhanced reality visualization is the process of enhancing an image by adding to it information which is not present in the original image. A wide variety of information can be added to an image ranging from hidden lines or surfaces to textual or iconic data about a particular part of the image. Enhanced reality visualization is particularly well suited to neurosurgery. By rendering brain structures which are not visible, at the correct location in an image of a patient's head, the surgeon is essentially provided with X-ray vision. He can visualize the spatial relationship between brain structures before he performs a craniotomy and during the surgery he can see what's under the next layer before he cuts through. Given a video image of the patient and a three dimensional model of the patient's brain the problem enhanced reality visualization faces is to render the model from the correct viewpoint and overlay it on the original image. The relationship between the coordinate frames of the patient, the patient's internal anatomy scans and the image plane of the camera observing the patient must be established. This problem is closely related to the camera calibration problem. This report presents a new approach to finding this relationship and develops a system for performing enhanced reality visualization in a surgical environment. Immediately prior to surgery a few circular fiducials are placed near the surgical site. An initial registration of video and internal data is performed using a laser scanner. Following this, our method is fully automatic, runs in nearly real-time, is accurate to within a pixel, allows both patient and camera motion, automatically corrects for changes to the internal camera parameters (focal length, focus, aperture, etc.) and requires only a single image.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Segmentation of medical imagery is a challenging problem due to the complexity of the images, as well as to the absence of models of the anatomy that fully capture the possible deformations in each structure. Brain tissue is a particularly complex structure, and its segmentation is an important step for studies in temporal change detection of morphology, as well as for 3D visualization in surgical planning. In this paper, we present a method for segmentation of brain tissue from magnetic resonance images that is a combination of three existing techniques from the Computer Vision literature: EM segmentation, binary morphology, and active contour models. Each of these techniques has been customized for the problem of brain tissue segmentation in a way that the resultant method is more robust than its components. Finally, we present the results of a parallel implementation of this method on IBM's supercomputer Power Visualization System for a database of 20 brain scans each with 256x256x124 voxels and validate those against segmentations generated by neuroanatomy experts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Humans distinguish materials such as metal, plastic, and paper effortlessly at a glance. Traditional computer vision systems cannot solve this problem at all. Recognizing surface reflectance properties from a single photograph is difficult because the observed image depends heavily on the amount of light incident from every direction. A mirrored sphere, for example, produces a different image in every environment. To make matters worse, two surfaces with different reflectance properties could produce identical images. The mirrored sphere simply reflects its surroundings, so in the right artificial setting, it could mimic the appearance of a matte ping-pong ball. Yet, humans possess an intuitive sense of what materials typically "look like" in the real world. This thesis develops computational algorithms with a similar ability to recognize reflectance properties from photographs under unknown, real-world illumination conditions. Real-world illumination is complex, with light typically incident on a surface from every direction. We find, however, that real-world illumination patterns are not arbitrary. They exhibit highly predictable spatial structure, which we describe largely in the wavelet domain. Although they differ in several respects from the typical photographs, illumination patterns share much of the regularity described in the natural image statistics literature. These properties of real-world illumination lead to predictable image statistics for a surface with given reflectance properties. We construct a system that classifies a surface according to its reflectance from a single photograph under unknown illuminination. Our algorithm learns relationships between surface reflectance and certain statistics computed from the observed image. Like the human visual system, we solve the otherwise underconstrained inverse problem of reflectance estimation by taking advantage of the statistical regularity of illumination. For surfaces with homogeneous reflectance properties and known geometry, our system rivals human performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper considers the problem of language change. Linguists must explain not only how languages are learned but also how and why they have evolved along certain trajectories and not others. While the language learning problem has focused on the behavior of individuals and how they acquire a particular grammar from a class of grammars ${cal G}$, here we consider a population of such learners and investigate the emergent, global population characteristics of linguistic communities over several generations. We argue that language change follows logically from specific assumptions about grammatical theories and learning paradigms. In particular, we are able to transform parameterized theories and memoryless acquisition algorithms into grammatical dynamical systems, whose evolution depicts a population's evolving linguistic composition. We investigate the linguistic and computational consequences of this model, showing that the formalization allows one to ask questions about diachronic that one otherwise could not ask, such as the effect of varying initial conditions on the resulting diachronic trajectories. From a more programmatic perspective, we give an example of how the dynamical system model for language change can serve as a way to distinguish among alternative grammatical theories, introducing a formal diachronic adequacy criterion for linguistic theories.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In order to estimate the motion of an object, the visual system needs to combine multiple local measurements, each of which carries some degree of ambiguity. We present a model of motion perception whereby measurements from different image regions are combined according to a Bayesian estimator --- the estimated motion maximizes the posterior probability assuming a prior favoring slow and smooth velocities. In reviewing a large number of previously published phenomena we find that the Bayesian estimator predicts a wide range of psychophysical results. This suggests that the seemingly complex set of illusions arise from a single computational strategy that is optimal under reasonable assumptions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed which explain the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM--based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ "featurally" are much easier to distinguish when inverted than those that differ "configurally" (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002) ??finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects' expectations, there is no difference between "featurally" and "configurally" transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a component based person detection system that is capable of detecting frontal, rear and near side views of people, and partially occluded persons in cluttered scenes. The framework that is described here for people is easily applied to other objects as well. The motivation for developing a component based approach is two fold: first, to enhance the performance of person detection systems on frontal and rear views of people and second, to develop a framework that directly addresses the problem of detecting people who are partially occluded or whose body parts blend in with the background. The data classification is handled by several support vector machine classifiers arranged in two layers. This architecture is known as Adaptive Combination of Classifiers (ACC). The system performs very well and is capable of detecting people even when all components of a person are not found. The performance of the system is significantly better than a full body person detector designed along similar lines. This suggests that the improved performance is due to the components based approach and the ACC data classification structure.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report outlines the problem of intelligent failure recovery in a problem-solver for electrical design. We want our problem solver to learn as much as it can from its mistakes. Thus we cast the engineering design process on terms of Problem Solving by Debugging Almost-Right Plans, a paradigm for automatic problem solving based on the belief that creation and removal of "bugs" is an unavoidable part of the process of solving a complex problem. The process of localization and removal of bugs called for by the PSBDARP theory requires an approach to engineering analysis in which every result has a justification which describes the exact set of assumptions it depends upon. We have developed a program based on Analysis by Propagation of Constraints which can explain the basis of its deductions. In addition to being useful to a PSBDARP designer, these justifications are used in Dependency-Directed Backtracking to limit the combinatorial search in the analysis routines. Although the research we will describe is explicitly about electrical circuits, we believe that similar principles and methods are employed by other kinds of engineers, including computer programmers.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis presents there important results in visual object recognition based on shape. (1) A new algorithm (RAST; Recognition by Adaptive Sudivisions of Tranformation space) is presented that has lower average-case complexity than any known recognition algorithm. (2) It is shown, both theoretically and empirically, that representing 3D objects as collections of 2D views (the "View-Based Approximation") is feasible and affects the reliability of 3D recognition systems no more than other commonly made approximations. (3) The problem of recognition in cluttered scenes is considered from a Bayesian perspective; the commonly-used "bounded-error errorsmeasure" is demonstrated to correspond to an independence assumption. It is shown that by modeling the statistical properties of real-scenes better, objects can be recognized more reliably.