994 results for Speaker Recognition


Relevance: 20.00%

Abstract:

This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.
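The disparity-based selection step described in this abstract can be illustrated with a minimal sketch. It assumes each feature already carries a stereo disparity value; the `Feature` class, the function name `select_near_fixation`, and the `half_width` parameter are hypothetical names chosen for illustration and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    x: float          # image column, in pixels
    y: float          # image row, in pixels
    disparity: float  # left/right offset, in pixels (assumed precomputed)


def select_near_fixation(features, fixation_disparity, half_width):
    """Keep features whose disparity lies within half_width of the fixation disparity."""
    return [
        f for f in features
        if abs(f.disparity - fixation_disparity) <= half_width
    ]


# Hypothetical data: two features near the fixation depth, two far from it.
features = [
    Feature(10, 12, 0.4),
    Feature(40, 8, 6.0),
    Feature(22, 30, -0.9),
    Feature(5, 5, -3.5),
]
selected = select_near_fixation(features, fixation_disparity=0.0, half_width=1.0)
print(selected)
```

Only the selected features would then be passed to the recognition stage, which is how the narrow disparity band reduces the number of candidate matches.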

Relevance: 20.00%

Abstract:

We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves, which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on: non-Cartesian networks; a definition of curved axis of inertia; and a ridge detector. I present evidence against frame alignment in human perception. This suggests: frame curves have a role in figure/ground segregation and in fuzzy boundaries; their outside/near/top/incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision), and proceeds by processing convex structures.

Relevance: 20.00%

Abstract:

Alignment is a prevalent approach for recognizing 3D objects in 2D images. A major problem with current implementations is how to robustly handle errors that propagate from uncertainties in the locations of image features. This thesis gives a technique for bounding these errors. The technique makes use of a new solution to the problem of recovering 3D pose from three matching point pairs under weak-perspective projection. Furthermore, the error bounds are used to demonstrate that using line segments for features instead of points significantly reduces the false positive rate, to the extent that alignment can remain reliable even in cluttered scenes.
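For readers unfamiliar with the projection model named here, the following sketch shows only the forward weak-perspective mapping (rotate, drop depth, scale uniformly, translate). It is not the thesis's pose-recovery solution or its error-bounding technique; the function name and argument layout are assumptions made for illustration.

```python
def weak_perspective(points, rotation, scale, translation):
    """Project 3D points to 2D under weak perspective.

    rotation is a 3x3 rotation matrix given as nested lists; scale is a
    single positive factor; translation is a 2D offset (tx, ty).
    """
    projected = []
    for p in points:
        # Rotate, keeping only the first two rows (orthographic projection).
        x = sum(rotation[0][k] * p[k] for k in range(3))
        y = sum(rotation[1][k] * p[k] for k in range(3))
        # Uniform scale stands in for division by an average depth.
        projected.append((scale * x + translation[0],
                          scale * y + translation[1]))
    return projected


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degrees about z

print(weak_perspective([(1, 2, 3)], identity, 2, (1, 1)))
print(weak_perspective([(1, 0, 5)], quarter_turn_z, 1, (0, 0)))
```

Because depth is discarded before scaling, the z coordinate of each point has no effect on its image position, which is what distinguishes this model from full perspective.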

Relevance: 20.00%

Abstract:

Recognizing standard computational structures (cliches) in a program can help an experienced programmer understand the program. We develop a graph parsing approach to automating program recognition in which programs and cliches are represented in an attributed graph grammar formalism and recognition is achieved by graph parsing. In studying this approach, we evaluate our representation's ability to suppress many common forms of variation which hinder recognition. We investigate the expressiveness of our graph grammar formalism for capturing programming cliches. We empirically and analytically study the computational cost of our recognition approach with respect to two medium-sized, real-world simulator programs.

Relevance: 20.00%

Abstract:

The primary goal of this report is to demonstrate how considerations from computational complexity theory can inform grammatical theorizing. To this end, generalized phrase structure grammar (GPSG) linguistic theory is revised so that its power more closely matches the limited ability of an ideal speaker--hearer: GPSG Recognition is EXP-POLY time hard, while Revised GPSG Recognition is NP-complete. A second goal is to provide a theoretical framework within which to better understand the wide range of existing GPSG models, embodied in formal definitions as well as in implemented computer programs. A grammar for English and an informal explanation of the GPSG/RGPSG syntactic features are included in appendices.

Relevance: 20.00%

Abstract:

Techniques, suitable for parallel implementation, for robust 2D model-based object recognition in the presence of sensor error are studied. Models and scene data are represented as local geometric features, and robust hypothesis of feature matchings and transformations is considered. Bounds on the error in the image feature geometry are assumed, constraining possible matchings and transformations. Transformation sampling is introduced as a simple, robust, polynomial-time, and highly parallel method of searching the space of transformations to hypothesize feature matchings. Key to the approach is that error in image feature measurement is explicitly accounted for. A Connection Machine implementation and experiments on real images are presented.
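The general idea of sampling a transformation space under a bounded-error model can be sketched as follows. This is a simplified illustration, not the report's algorithm: it assumes point features, restricts the transformations to 2D translations on a regular grid, and runs serially. All names (`count_matches`, `sample_translations`, `eps`) and the data are hypothetical.

```python
import math


def count_matches(model, image, t, eps):
    """Count model points that land within eps of some image point after translation t."""
    tx, ty = t
    matched = 0
    for mx, my in model:
        if any(math.hypot(mx + tx - ix, my + ty - iy) <= eps for ix, iy in image):
            matched += 1
    return matched


def sample_translations(model, image, eps, lo, hi, step):
    """Score every translation on a square grid and return the best one."""
    best_t, best_score = None, -1
    n = int(round((hi - lo) / step)) + 1
    for i in range(n):
        for j in range(n):
            t = (lo + i * step, lo + j * step)
            score = count_matches(model, image, t, eps)
            if score > best_score:  # first-found winner is kept on ties
                best_t, best_score = t, score
    return best_t, best_score


# Hypothetical data: the model shifted by roughly (5, 3), with small
# measurement error and one spurious image point.
model = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
image = [(5.1, 3.0), (7.0, 2.9), (5.0, 4.1), (9.0, 9.0)]
best_t, best_score = sample_translations(model, image, eps=0.25, lo=0.0, hi=10.0, step=0.5)
print(best_t, best_score)
```

Each sampled translation is scored independently of the others, which is why this style of search lends itself to parallel hardware; the error bound `eps` is what makes a match tolerant of noisy feature positions.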

Relevance: 20.00%

Abstract:

This thesis addresses the problem of recognizing solid objects in the three-dimensional world, using two-dimensional shape information extracted from a single image. Objects can be partly occluded and can occur in cluttered scenes. A model based approach is taken, where stored models are matched to an image. The matching problem is separated into two stages, which employ different representations of objects. The first stage uses the smallest possible number of local features to find transformations from a model to an image. This minimizes the amount of search required in recognition. The second stage uses the entire edge contour of an object to verify each transformation. This reduces the chance of finding false matches.

Relevance: 20.00%

Abstract:

The key to understanding a program is recognizing familiar algorithmic fragments and data structures in it. Automating this recognition process will make it easier to perform many tasks which require program understanding, e.g., maintenance, modification, and debugging. This report describes a recognition system, called the Recognizer, which automatically identifies occurrences of stereotyped computational fragments and data structures in programs. The Recognizer is able to identify these familiar fragments and structures, even though they may be expressed in a wide range of syntactic forms. It does so systematically and efficiently by using a parsing technique. Two important advances have made this possible. The first is a language-independent graphical representation for programs and programming structures which canonicalizes many syntactic features of programs. The second is an efficient graph parsing algorithm.

Relevance: 20.00%

Abstract:

A study is made of the recognition and transformation of figures by iterative arrays of finite state automata. A figure is a finite rectangular two-dimensional array of symbols. The iterative arrays considered are also finite, rectangular, and two-dimensional. The automata comprising any given array are called cells and are assumed to be isomorphic and to operate synchronously with the state of a cell at time t+1 being a function of the states of it and its four nearest neighbors at time t. At time t=0 each cell is placed in one of a fixed number of initial states. The pattern of initial states thus introduced represents the figure to be processed. The resulting sequence of array states represents a computation based on the input figure. If one waits for a specially designated cell to indicate acceptance or rejection of the figure, the array is said to be working on a recognition problem. If one waits for the array to come to a stable configuration representing an output figure, the array is said to be working on a transformation problem.
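The model in this abstract (identical cells, synchronous updates, each next state depending on the cell and its four nearest neighbors) can be sketched in a few lines. The fixed boundary state, the `spread` rule, and the stopping criterion below are illustrative assumptions, not details from the study; the example corresponds to the transformation setting, where one waits for a stable configuration.

```python
def step(grid, rule, boundary=0):
    """Apply one synchronous update: every cell reads the states at time t."""
    rows, cols = len(grid), len(grid[0])

    def at(r, c):
        # Cells outside the array are treated as a fixed boundary state (assumption).
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else boundary

    return [
        [rule(at(r, c), at(r - 1, c), at(r + 1, c), at(r, c - 1), at(r, c + 1))
         for c in range(cols)]
        for r in range(rows)
    ]


def run_until_stable(grid, rule, max_steps=1000):
    """Iterate until the array stops changing; return the final array and step count."""
    for t in range(max_steps):
        nxt = step(grid, rule)
        if nxt == grid:
            return grid, t
        grid = nxt
    raise RuntimeError("no stable configuration within max_steps")


def spread(me, north, south, west, east):
    """Example rule: a cell turns on if it or any of its four neighbors is on."""
    return max(me, north, south, west, east)


figure = [[1, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]
final, steps = run_until_stable(figure, spread)
print(final, steps)
```

A recognition problem would use the same `step` function but stop when one designated cell enters an accepting or rejecting state, rather than when the whole array stabilizes.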

Relevance: 20.00%

Abstract:

An investigation is made into the problem of constructing a model of the appearance to an optical input device of scenes consisting of plane-faced geometric solids. The goal is to study algorithms which find the real straight edges in the scenes, taking into account smooth variations in intensity over faces of the solids, blurring of edges and noise. A general mathematical analysis is made of optimal methods for identifying the edge lines in figures, given a raster of intensities covering the entire field of view. There is given in addition a suboptimal statistical decision procedure, based on the model, for the identification of a line within a narrow band on the field of view given an array of intensities from within the band. A computer program has been written and extensively tested which implements this procedure and extracts lines from real scenes. Other programs were written which judge the completeness of extracted sets of lines, and propose and test for additional lines which had escaped initial detection. The performance of these programs is discussed in relation to the theory derived from the model, and with regard to their use of global information in detecting and proposing lines.

Relevance: 20.00%

Abstract:

A computer may gather a lot of information from its environment in an optical or graphical manner. A scene, as seen for instance from a TV camera or a picture, can be transformed into a symbolic description of points and lines or surfaces. This thesis describes several programs, written in the language CONVERT, for the analysis of such descriptions in order to recognize, differentiate and identify desired objects or classes of objects in the scene. Examples are given in each case. Although the recognition might be in terms of projections of 2-dim and 3-dim objects, we do not deal with stereoscopic information. One of our programs (Polybrick) identifies parallelepipeds in a scene which may contain partially hidden bodies and non-parallelepipedic objects. The program TD works mainly with 2-dimensional figures, although under certain conditions it successfully identifies 3-dim objects. Overlapping objects are identified when they are transparent. A third program, DT, works with 3-dim and 2-dim objects, and does not identify objects which are not completely seen. Important restrictions and suppositions are: (a) the input is assumed perfect (noiseless), and in a symbolic format; (b) no perspective deformation is considered. A portion of this thesis is devoted to the study of models (symbolic representations) of the objects we want to identify; different schemes, some of them already in use, are discussed. Focusing our attention on the more general problem of identification of general objects when they substantially overlap, we propose some schemes for their recognition, and also analyze some problems that are met.

Relevance: 20.00%

Abstract:

A system for visual recognition is described, with implications for the general problem of representation of knowledge to assist control. The immediate objective is a computer system that will recognize objects in a visual scene, specifically hammers. The computer receives an array of light intensities from a device like a television camera. It is to locate and identify the hammer if one is present. The computer must produce from the numerical "sensory data" a symbolic description that constitutes its perception of the scene. Of primary concern is the control of the recognition process. Control decisions should be guided by the partial results obtained on the scene. If a hammer handle is observed this should suggest that the handle is part of a hammer and advise where to look for the hammer head. The particular knowledge that a handle has been found combines with general knowledge about hammers to influence the recognition process. This use of knowledge to direct control is denoted here by the term "active knowledge". A descriptive formalism is presented for visual knowledge which identifies the relationships relevant to the active use of the knowledge. A control structure is provided which can apply knowledge organized in this fashion actively to the processing of a given scene.

Relevance: 20.00%

Abstract:

Methods are presented (1) to partition or decompose a visual scene into the bodies forming it; (2) to position these bodies in three-dimensional space, by combining two scenes that make a stereoscopic pair; (3) to find the regions or zones of a visual scene that belong to its background; (4) to carry out the isolation of objects in (1) when the input has inaccuracies. Running computer programs implement the methods, and many examples illustrate their behavior. The input is a two-dimensional line-drawing of the scene, assumed to contain three-dimensional bodies possessing flat faces (polyhedra); some of them may be partially occluded. Suggestions are made for extending the work to curved objects. Some comparisons are made with human visual perception. The main conclusion is that it is possible to separate a picture or scene into the constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.

Relevance: 20.00%

Abstract:

This thesis presents a theory of human-like reasoning in the general domain of designed physical systems, and in particular, electronic circuits. One aspect of the theory, causal analysis, describes how the behavior of individual components can be combined to explain the behavior of composite systems. Another aspect of the theory, teleological analysis, describes how the notion that the system has a purpose can be used to aid this causal analysis. The theory is implemented as a computer program, which, given a circuit topology, can construct by qualitative causal analysis a mechanism graph describing the functional topology of the system. This functional topology is then parsed by a grammar for common circuit functions. Ambiguities are introduced into the analysis by the approximate qualitative nature of the analysis. For example, there are often several possible mechanisms which might describe the circuit's function. These are disambiguated by teleological analysis. The requirement that each component be assigned an appropriate purpose in the functional topology imposes a severe constraint which eliminates all the ambiguities. Since both analyses are based on heuristics, the chosen mechanism is a rationalization of how the circuit functions, and does not guarantee that the circuit actually does function. This type of coarse understanding of circuits is useful for analysis, design and troubleshooting.

Relevance: 20.00%

Abstract:

A molecularly imprinted polymer exhibiting considerable enantioselectivity for L-mandelic acid was prepared using metal coordination-chelation interaction. By evaluating the recognition characteristics in the chromatographic mode, the following recognition interactions were proposed: specific and nonspecific metal coordination-chelation interaction and hydrophobic interaction were responsible for substrate binding on the metal-complexing imprinted polymer, while the selective recognition came only from specific metal coordination-chelation interaction and specific hydrophobic interaction.