36 resultados para Vector representation
em Massachusetts Institute of Technology
Resumo:
We derive a new representation for a function as a linear combination of local correlation kernels at optimal sparse locations and discuss its relation to PCA, regularization, sparsity principles and Support Vector Machines. We first review previous results for the approximation of a function from discrete data (Girosi, 1998) in the context of Vapnik"s feature space and dual representation (Vapnik, 1995). We apply them to show 1) that a standard regularization functional with a stabilizer defined in terms of the correlation function induces a regression function in the span of the feature space of classical Principal Components and 2) that there exist a dual representations of the regression function in terms of a regularization network with a kernel equal to a generalized correlation function. We then describe the main observation of the paper: the dual representation in terms of the correlation function can be sparsified using the Support Vector Machines (Vapnik, 1982) technique and this operation is equivalent to sparsify a large dictionary of basis functions adapted to the task, using a variation of Basis Pursuit De-Noising (Chen, Donoho and Saunders, 1995; see also related work by Donahue and Geiger, 1994; Olshausen and Field, 1995; Lewicki and Sejnowski, 1998). In addition to extending the close relations between regularization, Support Vector Machines and sparsity, our work also illuminates and formalizes the LFA concept of Penev and Atick (1996). We discuss the relation between our results, which are about regression, and the different problem of pattern classification.
Resumo:
We explore representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a two-layer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, we trained the network to recognise ten objects from different viewpoints. The training process led to the emergence of compact representations of the specific input views. When tested on novel views of the same objects, the network exhibited a substantial generalisation capability. In simulated psychophysical experiments, the network's behavior was qualitatively similar to that of human subjects.
Resumo:
This paper describes a method for limiting vibration in flexible systems by shaping the system inputs. Unlike most previous attempts at input shaping, this method does not require an extensive system model or lengthy numerical computation; only knowledge of the system natural frequency and damping ratio are required. The effectiveness of this method when there are errors in the system model is explored and quantified. An algorithm is presented which, given an upper bound on acceptable residual vibration amplitude, determines a shaping strategy that is insensitive to errors in the estimated natural frequency. A procedure for shaping inputs to systems with input constraints is outlined. The shaping method is evaluated by dynamic simulations and hardware experiments.
Resumo:
A polynomial time algorithm (pruned correspondence search, PCS) with good average case performance for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3D objects from a single 2D image, is presented. Efficient verification algorithms, based on a linear representation of location constraints, are given for the case of affine transformations among vector spaces and for the case of rigid 2D and 3D transformations with scale. Some preliminary experiments suggest that PCS is a practical algorithm. Its similarity to existing correspondence based algorithms means that a number of existing techniques for speedup can be incorporated into PCS to improve its performance.
Resumo:
We present a novel ridge detector that finds ridges on vector fields. It is designed to automatically find the right scale of a ridge even in the presence of noise, multiple steps and narrow valleys. One of the key features of such ridge detector is that it has a zero response at discontinuities. The ridge detector can be applied to scalar and vector quantities such as color. We also present a parallel perceptual organization scheme based on such ridge detector that works without edges; in addition to perceptual groups, the scheme computes potential focus of attention points at which to direct future processing. The relation to human perception and several theoretical findings supporting the scheme are presented. We also show results of a Connection Machine implementation of the scheme for perceptual organization (without edges) using color.
Resumo:
The interpretation and recognition of noisy contours, such as silhouettes, have proven to be difficult. One obstacle to the solution of these problems has been the lack of a robust representation for contours. The contour is represented by a set of pairwise tangent circular arcs. The advantage of such an approach is that mathematical properties such as orientation and curvature are explicityly represented. We introduce a smoothing criterion for the contour tht optimizes the tradeoff between the complexity of the contour and proximity of the data points. The complexity measure is the number of extrema of curvature present in the contour. The smoothing criterion leads us to a true scale-space for contours. We describe the computation of the contour representation as well as the computation of relevant properties of the contour. We consider the potential application of the representation, the smoothing paradigm, and the scale-space to contour interpretation and recognition.
Resumo:
Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic representation is believed to be a viable alternative: they promise generalization while allowing the use of existing reinforcement-learning methods. Yet, there are few experiments on learning with deictic representations reported in the literature. In this paper we explore the effectiveness of two forms of deictic representation and a naive propositional representation in a simple blocks-world domain. We find, empirically, that the deictic representations actually worsen performance. We conclude with a discussion of possible causes of these results and strategies for more effective learning in domains with objects.
Resumo:
This paper explores the relationships between a computation theory of temporal representation (as developed by James Allen) and a formal linguistic theory of tense (as developed by Norbert Hornstein) and aspect. It aims to provide explicit answers to four fundamental questions: (1) what is the computational justification for the primitive of a linguistic theory; (2) what is the computational explanation of the formal grammatical constraints; (3) what are the processing constraints imposed on the learnability and markedness of these theoretical constructs; and (4) what are the constraints that a linguistic theory imposes on representations. We show that one can effectively exploit the interface between the language faculty and the cognitive faculties by using linguistic constraints to determine restrictions on the cognitive representation and vice versa. Three main results are obtained: (1) We derive an explanation of an observed grammatical constraint on tense?? Linear Order Constraint??m the information monotonicity property of the constraint propagation algorithm of Allen's temporal system: (2) We formulate a principle of markedness for the basic tense structures based on the computational efficiency of the temporal representations; and (3) We show Allen's interval-based temporal system is not arbitrary, but it can be used to explain independently motivated linguistic constraints on tense and aspect interpretations. We also claim that the methodology of research developed in this study??oss-level" investigation of independently motivated formal grammatical theory and computational models??a powerful paradigm with which to attack representational problems in basic cognitive domains, e.g., space, time, causality, etc.
Resumo:
This research is concerned with designing representations for analytical reasoning problems (of the sort found on the GRE and LSAT). These problems test the ability to draw logical conclusions. A computer program was developed that takes as input a straightforward predicate calculus translation of a problem, requests additional information if necessary, decides what to represent and how, designs representations capturing the constraints of the problem, and creates and executes a LISP program that uses those representations to produce a solution. Even though these problems are typically difficult for theorem provers to solve, the LISP program that uses the designed representations is very efficient.
Resumo:
This work addresses two related questions. The first question is what joint time-frequency energy representations are most appropriate for auditory signals, in particular, for speech signals in sonorant regions. The quadratic transforms of the signal are examined, a large class that includes, for example, the spectrograms and the Wigner distribution. Quasi-stationarity is not assumed, since this would neglect dynamic regions. A set of desired properties is proposed for the representation: (1) shift-invariance, (2) positivity, (3) superposition, (4) locality, and (5) smoothness. Several relations among these properties are proved: shift-invariance and positivity imply the transform is a superposition of spectrograms; positivity and superposition are equivalent conditions when the transform is real; positivity limits the simultaneous time and frequency resolution (locality) possible for the transform, defining an uncertainty relation for joint time-frequency energy representations; and locality and smoothness tradeoff by the 2-D generalization of the classical uncertainty relation. The transform that best meets these criteria is derived, which consists of two-dimensionally smoothed Wigner distributions with (possibly oriented) 2-D guassian kernels. These transforms are then related to time-frequency filtering, a method for estimating the time-varying 'transfer function' of the vocal tract, which is somewhat analogous to ceptstral filtering generalized to the time-varying case. Natural speech examples are provided. The second question addressed is how to obtain a rich, symbolic description of the phonetically relevant features in these time-frequency energy surfaces, the so-called schematic spectrogram. Time-frequency ridges, the 2-D analog of spectral peaks, are one feature that is proposed. If non-oriented kernels are used for the energy representation, then the ridge tops can be identified, with zero-crossings in the inner product of the gradient vector and the direction of greatest downward curvature. If oriented kernels are used, the method can be generalized to give better orientation selectivity (e.g., at intersecting ridges) at the cost of poorer time-frequency locality. Many speech examples are given showing the performance for some traditionally difficult cases: semi-vowels and glides, nasalized vowels, consonant-vowel transitions, female speech, and imperfect transmission channels.
Resumo:
This report shows how knowledge about the visual world can be built into a shape representation in the form of a descriptive vocabulary making explicit the important geometrical relationships comprising objects' shapes. Two computational tools are offered: (1) Shapestokens are placed on a Scale-Space Blackboard, (2) Dimensionality-reduction captures deformation classes in configurations of tokens. Knowledge lies in the token types and deformation classes tailored to the constraints and regularities ofparticular shape worlds. A hierarchical shape vocabulary has been implemented supporting several later visual tasks in the two-dimensional shape domain of the dorsal fins of fishes.
Resumo:
TYPICAL is a package for describing and making automatic inferences about a broad class of SCHEME predicate functions. These functions, called types following popular usage, delineate classes of primitive SCHEME objects, composite data structures, and abstract descriptions. TYPICAL types are generated by an extensible combinator language from either existing types or primitive terminals. These generated types are located in a lattice of predicate subsumption which captures necessary entailment between types; if satisfaction of one type necessarily entail satisfaction of another, the first type is below the second in the lattice. The inferences make by TYPICAL computes the position of the new definition within the lattice and establishes it there. This information is then accessible to both later inferences and other programs (reasoning systems, code analyzers, etc) which may need the information for their own purposes. TYPICAL was developed as a representation language for the discovery program Cyrano; particular examples are given of TYPICAL's application in the Cyrano program.
Resumo:
This paper describes ARLO, a representation language loosely modelled after Greiner and Lenant's RLL-1. ARLO is a structure-based representation language for describing structure-based representation languages, including itself. A given representation language is specified in ARLO by a collection of structures describing how its descriptions are interpreted, defaulted, and verified. This high level description is compiles into lisp code and ARLO structures whose interpretation fulfills the specified semantics of the representation. In addition, ARLO itself- as a representation language for expressing and compiling partial and complete language specifications- is described and interpreted in the same manner as the language it describes and implements. This self-description can be extended of modified to expand or alter the expressive power of ARLO's initial configuration. Languages which describe themselves like ARLO- provide powerful mediums for systems which perform automatic self-modification, optimization, debugging, or documentation. AI systems implemented in such a self-descriptive language can reflect on their own capabilities and limitations, applying general learning and problem solving strategies to enlarge or alleviate them.
Resumo:
This report describes a system which maintains canonical expressions for designators under a set of equalities. Substitution is used to maintain all knowledge in terms of these canonical expressions. A partial order on designators, termed the better-name relation, is used in the choice of canonical expressions. It is shown that with an appropriate better-name relation an important engineering reasoning technique, propagation of constraints, can be implemented as a special case of this substitution process. Special purpose algebraic simplification procedures are embedded such that they interact effectively with the equality system. An electrical circuit analysis system is developed which relies upon constraint propagation and algebraic simplification as primary reasoning techniques. The reasoning is guided by a better-name relation in which referentially transparent terms are preferred to referentially opaque ones. Multiple description of subcircuits are shown to interact strongly with the reasoning mechanism.
Resumo:
This thesis explores how to represent image texture in order to obtain information about the geometry and structure of surfaces, with particular emphasis on locating surface discontinuities. Theoretical and psychophysical results lead to the following conclusions for the representation of image texture: (1) A texture edge primitive is needed to identify texture change contours, which are formed by an abrupt change in the 2-D organization of similar items in an image. The texture edge can be used for locating discontinuities in surface structure and surface geometry and for establishing motion correspondence. (2) Abrupt changes in attributes that vary with changing surface geometry ??ientation, density, length, and width ??ould be used to identify discontinuities in surface geometry and surface structure. (3) Texture tokens are needed to separate the effects of different physical processes operating on a surface. They represent the local structure of the image texture. Their spatial variation can be used in the detection of texture discontinuities and texture gradients, and their temporal variation may be used for establishing motion correspondence. What precisely constitutes the texture tokens is unknown; it appears, however, that the intensity changes alone will not suffice, but local groupings of them may. (4) The above primitives need to be assigned rapidly over a large range in an image.