977 resultados para Image Interpretation
Resumo:
This paper consists of two major parts. First, we present the outline of a simple approach to very-low bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.
Resumo:
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, we have recently introduced techniques that are applicable under restricted conditions but simpler. The approach exploits image transformations that are specific to the relevant object class and learnable from example views of other "prototypical" objects of the same class. In this paper, we introduce such a new technique by extending the notion of linear class first proposed by Poggio and Vetter. For linear object classes it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively "rotate" high- resolution face images from a single 2D view.
Resumo:
This paper describes a machine vision system that classifies reflectance properties of surfaces such as metal, plastic, or paper, under unknown real-world illumination. We demonstrate performance of our algorithm for surfaces of arbitrary geometry. Reflectance estimation under arbitrary omnidirectional illumination proves highly underconstrained. Our reflectance estimation algorithm succeeds by learning relationships between surface reflectance and certain statistics computed from an observed image, which depend on statistical regularities in the spatial structure of real-world illumination. Although the algorithm assumes known geometry, its statistical nature makes it robust to inaccurate geometry estimates.
Resumo:
This is a collection of data on the construction operation and performance of the two image dissector cameras. Some of this data is useful in deciding whether certain shortcomings are significant for a given application and if so how to compensate for them.
Resumo:
We present an algorithm that uses multiple cues to recover shading and reflectance intrinsic images from a single image. Using both color information and a classifier trained to recognize gray-scale patterns, each image derivative is classified as being caused by shading or a change in the surface's reflectance. Generalized Belief Propagation is then used to propagate information from areas where the correct classification is clear to areas where it is ambiguous. We also show results on real images.
Resumo:
The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (e.g., albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from bandpassed image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) Shape recipes implicitly contain information about lighting and materials and we use them for material segmentation.
Resumo:
Binary image classifiction is a problem that has received much attention in recent years. In this paper we evaluate a selection of popular techniques in an effort to find a feature set/ classifier combination which generalizes well to full resolution image data. We then apply that system to images at one-half through one-sixteenth resolution, and consider the corresponding error rates. In addition, we further observe generalization performance as it depends on the number of training images, and lastly, compare the system's best error rates to that of a human performing an identical classification task given teh same set of test images.
Resumo:
We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure'' model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
Resumo:
We describe a program called SketchIT capable of producing multiple families of designs from a single sketch. The program is given a rough sketch (drawn using line segments for part faces and icons for springs and kinematic joints) and a description of the desired behavior. The sketch is "rough" in the sense that taken literally, it may not work. From this single, perhaps flawed sketch and the behavior description, the program produces an entire family of working designs. The program also produces design variants, each of which is itself a family of designs. SketchIT represents each family of designs with a "behavior ensuring parametric model" (BEP-Model), a parametric model augmented with a set of constraints that ensure the geometry provides the desired behavior. The construction of the BEP-Model from the sketch and behavior description is the primary task and source of difficulty in this undertaking. SketchIT begins by abstracting the sketch to produce a qualitative configuration space (qc-space) which it then uses as its primary representation of behavior. SketchIT modifies this initial qc-space until qualitative simulation verifies that it produces the desired behavior. SketchIT's task is then to find geometries that implement this qc-space. It does this using a library of qc-space fragments. Each fragment is a piece of parametric geometry with a set of constraints that ensure the geometry implements a specific kind of boundary (qcs-curve) in qc-space. SketchIT assembles the fragments to produce the BEP-Model. SketchIT produces design variants by mapping the qc-space to multiple implementations, and by transforming rotating parts to translating parts and vice versa.
Resumo:
The problems under consideration center around the interpretation of binocular stereo disparity. In particular, the goal is to establish a set of mappings from stereo disparity to corresponding three-dimensional scene geometry. An analysis has been developed that shows how disparity information can be interpreted in terms of three-dimensional scene properties, such as surface depth, discontinuities, and orientation. These theoretical developments have been embodied in a set of computer algorithms for the recovery of scene geometry from input stereo disparity. The results of applying these algorithms to several disparity maps are presented. Comparisons are made to the interpretation of stereo disparity by biological systems.
Resumo:
This thesis addresses the problem of recognizing solid objects in the three-dimensional world, using two-dimensional shape information extracted from a single image. Objects can be partly occluded and can occur in cluttered scenes. A model based approach is taken, where stored models are matched to an image. The matching problem is separated into two stages, which employ different representations of objects. The first stage uses the smallest possible number of local features to find transformations from a model to an image. This minimizes the amount of search required in recognition. The second stage uses the entire edge contour of an object to verify each transformation. This reduces the chance of finding false matches.
Resumo:
This report describes a paradigm for combining associational and causal reasoning to achieve efficient and robust problem-solving behavior. The Generate, Test and Debug (GTD) paradigm generates initial hypotheses using associational (heuristic) rules. The tester verifies hypotheses, supplying the debugger with causal explanations for bugs found if the test fails. The debugger uses domain-independent causal reasoning techniques to repair hypotheses, analyzing domain models and the causal explanations produced by the tester to determine how to replace faulty assumptions made by the generator. We analyze the strengths and weaknesses of associational and causal reasoning techniques, and present a theory of debugging plans and interpretations. The GTD paradigm has been implemented and tested in the domains of geologic interpretation, the blocks world, and Tower of Hanoi problems.
Resumo:
Rapid judgments about the properties and spatial relations of objects are the crux of visually guided interaction with the world. Vision begins, however, with essentially pointwise representations of the scene, such as arrays of pixels or small edge fragments. For adequate time-performance in recognition, manipulation, navigation, and reasoning, the processes that extract meaningful entities from the pointwise representations must exploit parallelism. This report develops a framework for the fast extraction of scene entities, based on a simple, local model of parallel computation.sAn image chunk is a subset of an image that can act as a unit in the course of spatial analysis. A parallel preprocessing stage constructs a variety of simple chunks uniformly over the visual array. On the basis of these chunks, subsequent serial processes locate relevant scene components and assemble detailed descriptions of them rapidly. This thesis defines image chunks that facilitate the most potentially time-consuming operations of spatial analysis---boundary tracing, area coloring, and the selection of locations at which to apply detailed analysis. Fast parallel processes for computing these chunks from images, and chunk-based formulations of indexing, tracing, and coloring, are presented. These processes have been simulated and evaluated on the lisp machine and the connection machine.
Resumo:
Geologic interpretation is the task of inferring a sequence of events to explain how a given geologic region could have been formed. This report describes the design and implementation of one part of a geologic interpretation problem solver -- a system which uses a simulation technique called imagining to check the validity of a candidate sequence of events. Imagining uses a combination of qualitative and quantitative simulations to reason about the changes which occured to the geologic region. The spatial changes which occur are simulated by constructing a sequence of diagrams. The quantitative simulation needs numeric parameters which are determined by using the qualitative simulation to establish the cumulative changes to an object and by using a description of the current geologic region to make quantitative measurements. The diversity of reasoning skills used in imagining has necessitated the development of multiple representations, each specialized for a different task. Representations to facilitate doing temporal, spatial and numeric reasoning are described in detail. We have also found it useful to explicitly represent processes. Both the qualitative and quantitative simulations use a discrete 'layer cake' model of geologic processes, but each uses a separate representation, specialized to support the type of simulation. These multiple representations have enabled us to develop a powerful, yet modular, system for reasoning about change.
Resumo:
How much information about the shape of an object can be inferred from its image? In particular, can the shape of an object be reconstructed by measuring the light it reflects from points on its surface? These questions were raised by Horn [HO70] who formulated a set of conditions such that the image formation can be described in terms of a first order partial differential equation, the image irradiance equation. In general, an image irradiance equation has infinitely many solutions. Thus constraints necessary to find a unique solution need to be identified. First we study the continuous image irradiance equation. It is demonstrated when and how the knowledge of the position of edges on a surface can be used to reconstruct the surface. Furthermore we show how much about the shape of a surface can be deduced from so called singular points. At these points the surface orientation is uniquely determined by the measured brightness. Then we investigate images in which certain types of silhouettes, which we call b-silhouettes, can be detected. In particular we answer the following question in the affirmative: Is there a set of constraints which assure that if an image irradiance equation has a solution, it is unique? To this end we postulate three constraints upon the image irradiance equation and prove that they are sufficient to uniquely reconstruct the surface from its image. Furthermore it is shown that any two of these constraints are insufficient to assure a unique solution to an image irradiance equation. Examples are given which illustrate the different issues. Finally, an overview of known numerical methods for computing solutions to an image irradiance equation are presented.