Biblioteca Digital

997 resultados para Aisberg-2004-10

A Binocular, Foveated Active Vision System

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This report documents the design and implementation of a binocular, foveated active vision system as part of the Cog project at the MIT Artificial Intelligence Laboratory. The active vision system features a three degree of freedom mechanical platform that supports four color cameras, a motion control system, and a parallel network of digital signal processors for image processing. To demonstrate the capabilities of the system, we present results from four sample visual-motor tasks.

Recovery of Three-Dimensional Objects from Single Perspective Images

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Any three-dimensional wire-frame object constructed out of parallelograms can be recovered from a single perspective two-dimensional image. A procedure for performing the recovery is given.

Coordinate-Independent Computations on Differential Equations

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This project investigates the computational representation of differentiable manifolds, with the primary goal of solving partial differential equations using multiple coordinate systems on general n- dimensional spaces. In the process, this abstraction is used to perform accurate integrations of ordinary differential equations using multiple coordinate systems. In the case of linear partial differential equations, however, unexpected difficulties arise even with the simplest equations.

Sparse Representations for Fast, One-Shot Learning

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Humans rapidly and reliably learn many kinds of regularities and generalizations. We propose a novel model of fast learning that exploits the properties of sparse representations and the constraints imposed by a plausible hardware mechanism. To demonstrate our approach we describe a computational model of acquisition in the domain of morphophonology. We encapsulate phonological information as bidirectional boolean constraint relations operating on the classical linguistic representations of speech sounds in term of distinctive features. The performance model is described as a hardware mechanism that incrementally enforces the constraints. Phonological behavior arises from the action of this mechanism. Constraints are induced from a corpus of common English nouns and verbs. The induction algorithm compiles the corpus into increasingly sophisticated constraints. The algorithm yields one-shot learning from a few examples. Our model has been implemented as a computer program. The program exhibits phonological behavior similar to that of young children. As a bonus the constraints that are acquired can be interpreted as classical linguistic rules.

Restructuring Sparse High Dimensional Data for Effective Retrieval

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector--a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity. In text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly-related words. Documents and queries are represented by their distance to these sets. and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute and shares properties with the optimal linear projection operator and the independent components of documents.

Separating Reflections from Images Using Independent Components Analysis

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The image of an object can vary dramatically depending on lighting, specularities/reflections and shadows. It is often advantageous to separate these incidental variations from the intrinsic aspects of an image. Along these lines this paper describes a method for photographing objects behind glass and digitally removing the reflections off the glass leaving the image of the objects behind the glass intact. We describe the details of this method which employs simple optical techniques and independent components analysis (ICA) and show its efficacy with several examples.

An Accelerated Chow and Liu Algorithm: Fitting Tree Distributions to High Dimensional Sparse Data

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Chow and Liu introduced an algorithm for fitting a multivariate distribution with a tree (i.e. a density model that assumes that there are only pairwise dependencies between variables) and that the graph of these dependencies is a spanning tree. The original algorithm is quadratic in the dimesion of the domain, and linear in the number of data points that define the target distribution $P$. This paper shows that for sparse, discrete data, fitting a tree distribution can be done in time and memory that is jointly subquadratic in the number of variables and the size of the data set. The new algorithm, called the acCL algorithm, takes advantage of the sparsity of the data to accelerate the computation of pairwise marginals and the sorting of the resulting mutual informations, achieving speed ups of up to 2-3 orders of magnitude in the experiments.

Monitoring Activities from Multiple Video Streams: Establishing a Common Coordinate Frame

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene"s ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.

Detecting Digital Forgeries Using Bispectral Analysis

Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the rapid increase in low-cost and sophisticated digital technology the need for techniques to authenticate digital material will become more urgent. In this paper we address the problem of authenticating digital signals assuming no explicit prior knowledge of the original. The basic approach that we take is to assume that in the frequency domain a "natural" signal has weak higher-order statistical correlations. We then show that "un-natural" correlations are introduced if this signal is passed through a non-linearity (which would almost surely occur in the creation of a forgery). Techniques from polyspectral analysis are then used to detect the presence of these correlations. We review the basics of polyspectral analysis, show how and why these tools can be used in detecting forgeries and show their effectiveness in analyzing human speech.

Stable Mixing of Complete and Incomplete Information

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An increasing number of parameter estimation tasks involve the use of at least two information sources, one complete but limited, the other abundant but incomplete. Standard algorithms such as EM (or em) used in this context are unfortunately not stable in the sense that they can lead to a dramatic loss of accuracy with the inclusion of incomplete observations. We provide a more controlled solution to this problem through differential equations that govern the evolution of locally optimal solutions (fixed points) as a function of the source weighting. This approach permits us to explicitly identify any critical (bifurcation) points leading to choices unsupported by the available complete data. The approach readily applies to any graphical model in O(n^3) time where n is the number of parameters. We use the naive Bayes model to illustrate these ideas and demonstrate the effectiveness of our approach in the context of text classification problems.

Simplifying transformations for type-alpha certificates

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents an algorithm for simplifying NDL deductions. An array of simplifying transformations are rigorously defined. They are shown to be terminating, and to respect the formal semantis of the language. We also show that the transformations never increase the size or complexity of a deduction---in the worst case, they produce deductions of the same size and complexity as the original. We present several examples of proofs containing various types of "detours", and explain how our procedure eliminates them, resulting in smaller and cleaner deductions. All of the given transformations are fully implemented in SML-NJ. The complete code listing is presented, along with explanatory comments. Finally, although the transformations given here are defined for NDL, we point out that they can be applied to any type-alpha DPL that satisfies a few simple conditions.

PDP-6 LISP (LISP 1.5) Revised

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This is a mosaic description of PDP-6 LISP, intended for readers familiar with the LISP 1.5 Programmer's Manual or who have used LISP on some other computer. Many of the features, such as the display, are subject to change. Thus, consult a PDP-6 system programmer for any differences which may exist between LISP of Oct. 14, 1966 and present LISP on the system tape.

Exploring Vision-Based Interfaces: How to Use Your Head in Dual Pointing Tasks

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The utility of vision-based face tracking for dual pointing tasks is evaluated. We first describe a 3-D face tracking technique based on real-time parametric motion-stereo, which is non-invasive, robust, and self-initialized. The tracker provides a real-time estimate of a ?frontal face ray? whose intersection with the display surface plane is used as a second stream of input for scrolling or pointing, in paral-lel with hand input. We evaluated the performance of com-bined head/hand input on a box selection and coloring task: users selected boxes with one pointer and colors with a second pointer, or performed both tasks with a single pointer. We found that performance with head and one hand was intermediate between single hand performance and dual hand performance. Our results are consistent with previously reported dual hand conflict in symmetric pointing tasks, and suggest that a head-based input stream should be used for asymmetric control.

Shape-Time Photography

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We introduce a new method to describe, in a single image, changes in shape over time. We acquire both range and image information with a stationary stereo camera. From the pictures taken, we display a composite image consisting of the image data from the surface closest to the camera at every pixel. This reveals the 3-d relationships over time by easy-to-interpret occlusion relationships in the composite image. We call the composite a shape-time photograph. Small errors in depth measurements cause artifacts in the shape-time images. We correct most of these using a Markov network to estimate the most probable front surface, taking into account the depth measurements, their uncertainties, and layer continuity assumptions.

The Image Dissector "Eyes"

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This is a collection of data on the construction operation and performance of the two image dissector cameras. Some of this data is useful in deciding whether certain shortcomings are significant for a given application and if so how to compensate for them.

«
1
2
...
51
52
53
54
55
56
57
...
66
67
»