11 results for computational statistics
at Massachusetts Institute of Technology
Abstract:
Humans distinguish materials such as metal, plastic, and paper effortlessly at a glance. Traditional computer vision systems cannot solve this problem at all. Recognizing surface reflectance properties from a single photograph is difficult because the observed image depends heavily on the amount of light incident from every direction. A mirrored sphere, for example, produces a different image in every environment. To make matters worse, two surfaces with different reflectance properties could produce identical images. The mirrored sphere simply reflects its surroundings, so in the right artificial setting, it could mimic the appearance of a matte ping-pong ball. Yet, humans possess an intuitive sense of what materials typically "look like" in the real world. This thesis develops computational algorithms with a similar ability to recognize reflectance properties from photographs under unknown, real-world illumination conditions. Real-world illumination is complex, with light typically incident on a surface from every direction. We find, however, that real-world illumination patterns are not arbitrary. They exhibit highly predictable spatial structure, which we describe largely in the wavelet domain. Although they differ in several respects from typical photographs, illumination patterns share much of the regularity described in the natural image statistics literature. These properties of real-world illumination lead to predictable image statistics for a surface with given reflectance properties. We construct a system that classifies a surface according to its reflectance from a single photograph under unknown illumination. Our algorithm learns relationships between surface reflectance and certain statistics computed from the observed image. Like the human visual system, we solve the otherwise underconstrained inverse problem of reflectance estimation by taking advantage of the statistical regularity of illumination.
For surfaces with homogeneous reflectance properties and known geometry, our system rivals human performance.
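As a rough illustration of what "statistics computed from the observed image" in the wavelet domain might look like, the sketch below applies a Haar transform to a small grayscale image and summarizes the detail coefficients at each scale by their variance and kurtosis. The choice of the Haar wavelet, the two statistics, and the function names `haar_level` and `wavelet_statistics` are assumptions made for this example; the abstract does not say which wavelet or which statistics the thesis uses.

```python
from statistics import fmean


def haar_level(img):
    """One level of a 2D Haar transform on a grayscale image (list of rows).

    The image must have an even number of rows and columns.
    Returns (approx, details): the half-size average image and a flat list
    of the horizontal, vertical, and diagonal detail coefficients.
    """
    approx, details = [], []
    for r in range(0, len(img), 2):
        row = []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row.append((a + b + d + e) / 4.0)
            details.append((a - b + d - e) / 4.0)  # left/right difference
            details.append((a + b - d - e) / 4.0)  # top/bottom difference
            details.append((a - b - d + e) / 4.0)  # diagonal difference
        approx.append(row)
    return approx, details


def wavelet_statistics(img, levels=2):
    """Return one (variance, kurtosis) pair per scale, finest scale first.

    Both image dimensions must be divisible by 2**levels. The pair is
    (0.0, 0.0) when the coefficients at a scale have zero variance.
    """
    feats = []
    current = img
    for _ in range(levels):
        current, details = haar_level(current)
        mean = fmean(details)
        var = fmean([(x - mean) ** 2 for x in details])
        if var > 0:
            kurt = fmean([(x - mean) ** 4 for x in details]) / var ** 2
        else:
            kurt = 0.0
        feats.append((var, kurt))
    return feats


# Hypothetical usage: a 4x4 checkerboard has energy only at the finest scale.
board = [[(r + c) % 2 for c in range(4)] for r in range(4)]
features = wavelet_statistics(board, levels=2)
```

A feature vector of this kind could then be handed to any standard classifier trained on surfaces of known reflectance; that pairing is likewise an assumption rather than a description of the thesis's system.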
Abstract:
We review the progress made in computational vision, as represented by Marr's approach, in the last fifteen years. First, we briefly outline computational theories developed for low, middle and high-level vision. We then discuss in more detail solutions proposed to three representative problems in vision, each dealing with a different level of visual processing. Finally, we discuss modifications to the currently established computational paradigm that appear to be dictated by the recent developments in vision.
Abstract:
The computer science technique of computational complexity analysis can provide powerful insights into the algorithm-neutral analysis of information processing tasks. Here we show that a simple, theory-neutral linguistic model of syntactic agreement and ambiguity demonstrates that natural language parsing may be computationally intractable. Significantly, we show that it may be syntactic features rather than rules that can cause this difficulty. Informally, human languages and the computationally intractable Satisfiability (SAT) problem share two costly computational mechanisms: both enforce agreement among symbols across unbounded distances (Subject-Verb agreement) and both allow ambiguity (is a word a Noun or a Verb?).
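The interaction of ambiguity and agreement can be made concrete with a small brute-force search. The sketch below is not the reduction used in the paper; it only shows why the two mechanisms together are costly. Each word has several candidate readings (ambiguity), agreement constraints tie together positions that may be far apart, and a naive solver has to consider every combination of readings. The lexicon, the feature names, and the function `consistent_readings` are hypothetical.

```python
from itertools import product


def consistent_readings(lexicon, sentence, agreements):
    """Enumerate every assignment of readings that satisfies all agreements.

    lexicon:    word -> list of readings, each a dict of feature values
    sentence:   list of words
    agreements: list of (i, j, feature) triples requiring the readings at
                positions i and j to share the same value for `feature`
    With k readings per word and n words, the loop visits up to k**n
    combinations, which is the source of the exponential cost.
    """
    options = [lexicon[word] for word in sentence]
    results = []
    for reading in product(*options):
        if all(reading[i][f] == reading[j][f] for i, j, f in agreements):
            results.append(reading)
    return results


# Hypothetical toy lexicon: both words are ambiguous.
lexicon = {
    "sheep": [{"cat": "N", "num": "sg"}, {"cat": "N", "num": "pl"}],
    "fish": [
        {"cat": "N", "num": "sg"},
        {"cat": "N", "num": "pl"},
        {"cat": "V", "num": "pl"},
    ],
}

# Require number agreement between position 0 and position 1.
readings = consistent_readings(lexicon, ["sheep", "fish"], [(0, 1, "num")])
```

Of the six combinations in this toy case, three satisfy the agreement constraint. Adding more ambiguous words multiplies the number of combinations to check, much as adding variables does in a SAT instance.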
Abstract:
This thesis introduces elements of a theory of design activity and a computational framework for developing design systems. The theory stresses the opportunistic nature of designing and the complementary roles of focus and distraction, the interdependence of evaluation and generation, the multiplicity of ways of seeing over the history of a design session versus the exclusivity of a given way of seeing over an arbitrarily short period, and the incommensurability of criteria used to evaluate a design. The thesis argues for a principle-based rather than rule-based approach to designing documents. The Discursive Generator is presented as a computational framework for implementing specific design systems, and a simple system for arranging blocks according to a set of formal principles is developed by way of illustration. Both shape grammars and constraint-based systems are used to contrast current trends in design automation with the discursive approach advocated in the thesis. The Discursive Generator is shown to have some important properties lacking in other types of systems, such as dynamism, robustness and the ability to deal with partial designs. When studied in terms of a search metaphor, the Discursive Generator is shown to exhibit behavior which is radically different from some traditional search techniques, and to avoid some of the well-known difficulties associated with them.
Abstract:
Does knowledge of language consist of symbolic rules? How do children learn and use their linguistic knowledge? To elucidate these questions, we present a computational model that acquires phonological knowledge from a corpus of common English nouns and verbs. In our model the phonological knowledge is encapsulated as boolean constraints operating on classical linguistic representations of speech sounds in terms of distinctive features. The learning algorithm compiles a corpus of words into increasingly sophisticated constraints. The algorithm is incremental, greedy, and fast. It yields one-shot learning of phonological constraints from a few examples. Our system exhibits behavior similar to that of young children learning phonological knowledge. As a bonus the constraints can be interpreted as classical linguistic rules. The computational model can be implemented by a surprisingly simple hardware mechanism. Our mechanism also sheds light on a fundamental AI question: How are signals related to symbols?
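To give a feel for a boolean constraint over distinctive features, the sketch below encodes one well-known regularity of English plurals: the suffix agrees in voicing with the final sound of the stem ("cats" ends in [s], "dogs" in [z]). This is only an illustration of the representation, not the learning algorithm from the abstract. The feature table, the single `voiced` feature, and both function names are assumptions, and stems ending in sibilants (as in "buses") are deliberately ignored.

```python
# Hypothetical feature table: each sound is described by one boolean feature.
FEATURES = {
    "t": {"voiced": False},
    "k": {"voiced": False},
    "s": {"voiced": False},
    "d": {"voiced": True},
    "g": {"voiced": True},
    "z": {"voiced": True},
}


def voicing_agreement(stem_final, suffix):
    """Boolean constraint: adjacent sounds must share the same voicing."""
    return FEATURES[stem_final]["voiced"] == FEATURES[suffix]["voiced"]


def choose_plural_suffix(stem_final):
    """Pick the plural suffix ([s] or [z]) that satisfies the constraint."""
    for suffix in ("s", "z"):
        if voicing_agreement(stem_final, suffix):
            return suffix
    raise ValueError("no suffix satisfies the constraint")


# "cat" ends in voiceless [t]; "dog" ends in voiced [g].
cat_suffix = choose_plural_suffix("t")
dog_suffix = choose_plural_suffix("g")
```

Read as a rule, the same constraint says "the plural suffix takes the voicing of the preceding sound", which mirrors the abstract's point that constraints can be interpreted as classical linguistic rules.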
Abstract:
Humans recognize optical reflectance properties of surfaces such as metal, plastic, or paper from a single image without knowledge of illumination. We develop a machine vision system to perform similar recognition tasks automatically. Reflectance estimation under unknown, arbitrary illumination proves highly underconstrained due to the variety of potential illumination distributions and surface reflectance properties. We have found that the spatial structure of real-world illumination possesses some of the statistical regularities observed in the natural image statistics literature. A human or computer vision system may be able to exploit this prior information to determine the most likely surface reflectance given an observed image. We develop an algorithm for reflectance classification under unknown real-world illumination, which learns relationships between surface reflectance and certain features (statistics) computed from a single observed image. We also develop an automatic feature selection method.
Abstract:
This report describes a computational system with which phonologists may describe a natural language in terms of autosegmental phonology, currently the most advanced theory pertaining to the sound systems of human languages. This system allows linguists to easily test autosegmental hypotheses against a large corpus of data. The system was designed primarily with tonal systems in mind, but also provides support for tree or feature matrix representation of phonemes (as in The Sound Pattern of English), as well as syllable structures and other aspects of phonological theory. Underspecification is allowed, and trees may be specified before, during, and after rule application. The association convention is automatically applied, and other principles such as the conjunctivity condition are supported. The method of representation was designed such that rules are designated in as close a fashion as possible to the existing conventions of autosegmental theory while adhering to a textual constraint for maximum portability.
Abstract:
This thesis describes an investigation of retinal directional selectivity. We show intracellular (whole-cell patch) recordings in turtle retina which indicate that this computation occurs prior to the ganglion cell, and we describe a pre-ganglionic circuit model to account for this and other findings which places the non-linear spatio-temporal filter at individual, oriented amacrine cell dendrites. The key non-linearity is provided by interactions between excitatory and inhibitory synaptic inputs onto the dendrites, and their distal tips provide directionally selective excitatory outputs onto ganglion cells. Detailed simulations of putative cells support this model, given reasonable parameter constraints. The performance of the model also suggests that this computational substructure may be relevant within the dendritic trees of CNS neurons in general.
Abstract:
The primary goal of this report is to demonstrate how considerations from computational complexity theory can inform grammatical theorizing. To this end, generalized phrase structure grammar (GPSG) linguistic theory is revised so that its power more closely matches the limited ability of an ideal speaker-hearer: GPSG Recognition is EXP-POLY time hard, while Revised GPSG Recognition is NP-complete. A second goal is to provide a theoretical framework within which to better understand the wide range of existing GPSG models, embodied in formal definitions as well as in implemented computer programs. A grammar for English and an informal explanation of the GPSG/RGPSG syntactic features are included in appendices.
Abstract:
This report investigates the process of focussing as a description and explanation of the comprehension of certain anaphoric expressions in English discourse. The investigation centers on the interpretation of definite anaphora, that is, on the personal pronouns and on noun phrases used with a definite article (the, this, or that). Focussing is formalized as a process in which a speaker centers attention on a particular aspect of the discourse. An algorithmic description specifies what the speaker can focus on and how the speaker may change the focus of the discourse as the discourse unfolds. The algorithm allows for a simple focussing mechanism to be constructed: an element in focus, an ordered collection of alternate foci, and a stack of old foci. The data structure for the element in focus is a representation which encodes a limited set of associations between it and other elements from the discourse as well as from general knowledge.
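The three-part focussing mechanism named above (an element in focus, an ordered collection of alternate foci, and a stack of old foci) maps naturally onto a small data structure. The sketch below is a minimal guess at its shape. The class name `FocusState`, the use of plain strings for discourse elements, and the two operations `shift_to` and `return_to_previous` are assumptions; the report's actual rules for when the focus moves are not given in the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FocusState:
    """Hypothetical container for the focussing mechanism's three parts."""

    current: Optional[str] = None                        # element in focus
    alternates: List[str] = field(default_factory=list)  # ordered alternate foci
    old_foci: List[str] = field(default_factory=list)    # stack of old foci

    def shift_to(self, element: str) -> None:
        """Move the focus to `element`, stacking the previous focus."""
        if self.current is not None:
            self.old_foci.append(self.current)
        if element in self.alternates:
            self.alternates.remove(element)
        self.current = element

    def return_to_previous(self) -> str:
        """Pop the most recent old focus and make it current again."""
        self.current = self.old_foci.pop()
        return self.current


# Hypothetical usage: the discourse moves from "house" to "door" and back.
state = FocusState(alternates=["door", "key"])
state.shift_to("house")
state.shift_to("door")
restored = state.return_to_previous()
```

Because old foci sit on a stack, a pronoun that refers back to an earlier topic would be resolved against the most recently abandoned focus first. That ordering is a property of this sketch, not a claim about the report's algorithm.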
Abstract:
This thesis confronts the nature of the process of learning an intellectual skill, the ability to solve problems efficiently in a particular domain of discourse. The investigation is synthetic; a computational performance model, HACKER, is displayed. HACKER is a computer problem-solving system whose performance improves with practice. HACKER maintains performance knowledge as a library of procedures indexed by descriptions of the problem types for which the procedures are appropriate. When applied to a problem, HACKER tries to use a procedure from this "Answer Library". If no procedure is found to be applicable, HACKER writes one using more general knowledge of the problem domain and of programming techniques. This new program may be generalized and added to the Answer Library.
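The lookup-then-write loop described above can be sketched in a few lines. This is a heavily simplified illustration, not HACKER itself. Problem types are matched by exact key rather than by description, the `write_procedure` callback stands in for HACKER's general knowledge of the domain and of programming, and generalization is omitted. The class name `AnswerLibrary` and its methods are hypothetical.

```python
class AnswerLibrary:
    """Hypothetical library of procedures indexed by problem type."""

    def __init__(self, write_procedure):
        # write_procedure(problem_type) returns a new callable for that type;
        # it is a stand-in for writing a program from general knowledge.
        self._write_procedure = write_procedure
        self._procedures = {}

    def solve(self, problem_type, problem):
        """Use a stored procedure if one applies, else write and store one."""
        procedure = self._procedures.get(problem_type)
        if procedure is None:
            procedure = self._write_procedure(problem_type)
            self._procedures[problem_type] = procedure
        return procedure(problem)

    def known_types(self):
        """Return the problem types the library can already handle."""
        return sorted(self._procedures)


# Hypothetical usage: count how often a new procedure has to be written.
written = []


def write_procedure(problem_type):
    written.append(problem_type)
    if problem_type == "sum":
        return lambda numbers: sum(numbers)
    return lambda numbers: max(numbers)


library = AnswerLibrary(write_procedure)
first = library.solve("sum", [1, 2, 3])
second = library.solve("sum", [10, 20])
```

The second call reuses the stored procedure instead of writing a new one, which is the sense in which performance "improves with practice" in this toy version.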