15 resultados para sets of words
em Massachusetts Institute of Technology
Resumo:
Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and unlike automatic systems are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplifies the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces. Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.
Resumo:
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three scales or one scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.
Resumo:
We consider the problem of matching model and sensory data features in the presence of geometric uncertainty, for the purpose of object localization and identification. The problem is to construct sets of model feature and sensory data feature pairs that are geometrically consistent given that there is uncertainty in the geometry of the sensory data features. If there is no geometric uncertainty, polynomial-time algorithms are possible for feature matching, yet these approaches can fail when there is uncertainty in the geometry of data features. Existing matching and recognition techniques which account for the geometric uncertainty in features either cannot guarantee finding a correct solution, or can construct geometrically consistent sets of feature pairs yet have worst case exponential complexity in terms of the number of features. The major new contribution of this work is to demonstrate a polynomial-time algorithm for constructing sets of geometrically consistent feature pairs given uncertainty in the geometry of the data features. We show that under a certain model of geometric uncertainty the feature matching problem in the presence of uncertainty is of polynomial complexity. This has important theoretical implications by demonstrating an upper bound on the complexity of the matching problem, an by offering insight into the nature of the matching problem itself. These insights prove useful in the solution to the matching problem in higher dimensional cases as well, such as matching three-dimensional models to either two or three-dimensional sensory data. The approach is based on an analysis of the space of feasible transformation parameters. This paper outlines the mathematical basis for the method, and describes the implementation of an algorithm for the procedure. Experiments demonstrating the method are reported.
Resumo:
A procedure is given for recognizing sets of inference rules that generate polynomial time decidable inference relations. The procedure can automatically recognize the tractability of the inference rules underlying congruence closure. The recognition of tractability for that particular rule set constitutes mechanical verification of a theorem originally proved independently by Kozen and Shostak. The procedure is algorithmic, rather than heuristic, and the class of automatically recognizable tractable rule sets can be precisely characterized. A series of examples of rule sets whose tractability is non-trivial, yet machine recognizable, is also given. The technical framework developed here is viewed as a first step toward a general theory of tractable inference relations.
Resumo:
I wish to propose a quite speculative new version of the grandmother cell theory to explain how the brain, or parts of it, may work. In particular, I discuss how the visual system may learn to recognize 3D objects. The model would apply directly to the cortical cells involved in visual face recognition. I will also outline the relation of our theory to existing models of the cerebellum and of motor control. Specific biophysical mechanisms can be readily suggested as part of a basic type of neural circuitry that can learn to approximate multidimensional input-output mappings from sets of examples and that is expected to be replicated in different regions of the brain and across modalities. The main points of the theory are: -the brain uses modules for multivariate function approximation as basic components of several of its information processing subsystems. -these modules are realized as HyperBF networks (Poggio and Girosi, 1990a,b). -HyperBF networks can be implemented in terms of biologically plausible mechanisms and circuitry. The theory predicts a specific type of population coding that represents an extension of schemes such as look-up tables. I will conclude with some speculations about the trade-off between memory and computation and the evolution of intelligence.
Resumo:
This research is concerned with the development of tactual displays to supplement the information available through lipreading. Because voicing carries a high informational load in speech and is not well transmitted through lipreading, the efforts are focused on providing tactual displays of voicing to supplement the information available on the lips of the talker. This research includes exploration of 1) signal-processing schemes to extract information about voicing from the acoustic speech signal, 2) methods of displaying this information through a multi-finger tactual display, and 3) perceptual evaluations of voicing reception through the tactual display alone (T), lipreading alone (L), and the combined condition (L+T). Signal processing for the extraction of voicing information used amplitude-envelope signals derived from filtered bands of speech (i.e., envelopes derived from a lowpass-filtered band at 350 Hz and from a highpass-filtered band at 3000 Hz). Acoustic measurements made on the envelope signals of a set of 16 initial consonants represented through multiple tokens of C1VC2 syllables indicate that the onset-timing difference between the low- and high-frequency envelopes (EOA: envelope-onset asynchrony) provides a reliable and robust cue for distinguishing voiced from voiceless consonants. This acoustic cue was presented through a two-finger tactual display such that the envelope of the high-frequency band was used to modulate a 250-Hz carrier signal delivered to the index finger (250-I) and the envelope of the low-frequency band was used to modulate a 50-Hz carrier delivered to the thumb (50T). The temporal-onset order threshold for these two signals, measured with roving signal amplitude and duration, averaged 34 msec, sufficiently small for use of the EOA cue. Perceptual evaluations of the tactual display of EOA with speech signal indicated: 1) that the cue was highly effective for discrimination of pairs of voicing contrasts; 2) that the identification of 16 consonants was improved by roughly 15 percentage points with the addition of the tactual cue over L alone; and 3) that no improvements in L+T over L were observed for reception of words in sentences, indicating the need for further training on this task
Resumo:
Does knowledge of language consist of symbolic rules? How do children learn and use their linguistic knowledge? To elucidate these questions, we present a computational model that acquires phonological knowledge from a corpus of common English nouns and verbs. In our model the phonological knowledge is encapsulated as boolean constraints operating on classical linguistic representations of speech sounds in term of distinctive features. The learning algorithm compiles a corpus of words into increasingly sophisticated constraints. The algorithm is incremental, greedy, and fast. It yields one-shot learning of phonological constraints from a few examples. Our system exhibits behavior similar to that of young children learning phonological knowledge. As a bonus the constraints can be interpreted as classical linguistic rules. The computational model can be implemented by a surprisingly simple hardware mechanism. Our mechanism also sheds light on a fundamental AI question: How are signals related to symbols?
Resumo:
The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector--a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity. In text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly-related words. Documents and queries are represented by their distance to these sets. and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute and shares properties with the optimal linear projection operator and the independent components of documents.
Resumo:
In this thesis, two different sets of experiments are described. The first is an exploration of the microscopic superfluidity of dilute gaseous Bose- Einstein condensates. The second set of experiments were performed using transported condensates in a new BEC apparatus. Superfluidity was probed by moving impurities through a trapped condensate. The impurities were created using an optical Raman transition, which transferred a small fraction of the atoms into an untrapped hyperfine state. A dramatic reduction in the collisions between the moving impurities and the condensate was observed when the velocity of the impurities was close to the speed of sound of the condensate. This reduction was attributed to the superfluid properties of a BEC. In addition, we observed an increase in the collisional density as the number of impurity atoms increased. This enhancement is an indication of bosonic stimulation by the occupied final states. This stimulation was observed both at small and large velocities relative to the speed of sound. A theoretical calculation of the effect of finite temperature indicated that collision rate should be enhanced at small velocities due to thermal excitations. However, in the current experiments we were insensitive to this effect. Finally, the factor of two between the collisional rate between indistinguishable and distinguishable atoms was confirmed. A new BEC apparatus that can transport condensates using optical tweezers was constructed. Condensates containing 10-15 million sodium atoms were produced in 20 s using conventional BEC production techniques. These condensates were then transferred into an optical trap that was translated from the âproduction chamber’ into a separate vacuum chamber: the âscience chamber’. Typically, we transferred 2-3 million condensed atoms in less than 2 s. This transport technique avoids optical and mechanical constrainsts of conventional condensate experiments and allows for the possibility of novel experiments. In the first experiments using transported BEC, we loaded condensed atoms from the optical tweezers into both macroscopic and miniaturized magnetic traps. Using microfabricated wires on a silicon chip, we observed excitation-less propagation of a BEC in a magnetic waveguide. The condensates fragmented when brought very close to the wire surface indicating that imperfections in the fabrication process might limit future experiments. Finally, we generated a continuous BEC source by periodically replenishing a condensate held in an optical reservoir trap using fresh condensates delivered using optical tweezers. More than a million condensed atoms were always present in the continuous source, raising the possibility of realizing a truly continuous atom lase.
Resumo:
An investigation is made into the problem of constructing a model of the appearance to an optical input device of scenes consisting of plane-faced geometric solids. The goal is to study algorithms which find the real straight edges in the scenes, taking into account smooth variations in intensity over faces of the solids, blurring of edges and noise. A general mathematical analysis is made of optimal methods for identifying the edge lines in figures, given a raster of intensities covering the entire field of view. There is given in addition a suboptimal statistical decision procedure, based on the model, for the identification of a line within a narrow band on the field of view given an array of intensities from within the band. A computer program has been written and extensively tested which implements this procedure and extracts lines from real scenes. Other programs were written which judge the completeness of extracted sets of lines, and propose and test for additional lines which had escaped initial detection. The performance of these programs is discussed in relation to the theory derived from the model, and with regard to their use of global information in detecting and proposing lines.
Resumo:
How can one represent the meaning of English sentences in a formal logical notation such that the translation of English into this logical form is simple and general? This report answers this question for a particular kind of meaning, namely quantifier scope, and for a particular part of the translation, namely the syntactic influence on the translation. Rules are presented which predict, for example, that the sentence: Everyone in this room speaks at least two languages. has the quantifier scope AE in standard predicate calculus, while the sentence: At lease two languages are spoken by everyone in this room. has the quantifier scope EA. Three different logical forms are presented, and their translation rules are examined. One of the logical forms is predicate calculus. The translation rules for it were developed by Robert May (May 19 77). The other two logical forms are Skolem form and a simple computer programming language. The translation rules for these two logical forms are new. All three sets of translation rules are shown to be general, in the sense that the same rules express the constraints that syntax imposes on certain other linguistic phenomena. For example, the rules that constrain the translation into Skolem form are shown to constrain definite np anaphora as well. A large body of carefully collected data is presented, and used to assess the empirical accuracy of each of the theories. None of the three theories is vastly superior to the others. However, the report concludes by suggesting that a combination of the two newer theories would have the greatest generality and the highest empirical accuracy.
Resumo:
This thesis presents the ideas underlying a computer program that takes as input a schematic of a mechanical or hydraulic power transmission system, plus specifications and a utility function, and returns catalog numbers from predefined catalogs for the optimal selection of components implementing the design. Unlike programs for designing single components or systems, the program provides the designer with a high level "language" in which to compose new designs. It then performs some of the detailed design process. The process of "compilation" is based on a formalization of quantitative inferences about hierarchically organized sets of artifacts and operating conditions. This allows the design compilation without the exhaustive enumeration of alternatives.
Resumo:
The central thesis of this report is that human language is NP-complete. That is, the process of comprehending and producing utterances is bounded above by the class NP, and below by NP-hardness. This constructive complexity thesis has two empirical consequences. The first is to predict that a linguistic theory outside NP is unnaturally powerful. The second is to predict that a linguistic theory easier than NP-hard is descriptively inadequate. To prove the lower bound, I show that the following three subproblems of language comprehension are all NP-hard: decide whether a given sound is possible sound of a given language; disambiguate a sequence of words; and compute the antecedents of pronouns. The proofs are based directly on the empirical facts of the language user's knowledge, under an appropriate idealization. Therefore, they are invariant across linguistic theories. (For this reason, no knowledge of linguistic theory is needed to understand the proofs, only knowledge of English.) To illustrate the usefulness of the upper bound, I show that two widely-accepted analyses of the language user's knowledge (of syntactic ellipsis and phonological dependencies) lead to complexity outside of NP (PSPACE-hard and Undecidable, respectively). Next, guided by the complexity proofs, I construct alternate linguisitic analyses that are strictly superior on descriptive grounds, as well as being less complex computationally (in NP). The report also presents a new framework for linguistic theorizing, that resolves important puzzles in generative linguistics, and guides the mathematical investigation of human language.
Resumo:
At the time of a customer order, the e-tailer assigns the order to one or more of its order fulfillment centers, and/or to drop shippers, so as to minimize procurement and transportation costs, based on the available current information. However this assignment is necessarily myopic as it cannot account for all future events, such as subsequent customer orders or inventory replenishments. We examine the potential benefits from periodically re-evaluating these real-time order-assignment decisions. We construct near-optimal heuristics for the re-assignment for a large set of customer orders with the objective to minimize the total number of shipments. We investigate how best to implement these heuristics for a rolling horizon, and discuss the effect of demand correlation, customer order size, and the number of customer orders on the nature of the heuristics. Finally, we present potential saving opportunities by testing the heuristics on sets of order data from a major e-tailer.
Resumo:
The release of growth factors from tissue engineering scaffolds provides signals that influence the migration, differentiation, and proliferation of cells. The incorporation of a drug delivery platform that is capable of tunable release will give tissue engineers greater versatility in the direction of tissue regeneration. We have prepared a novel composite of two biomaterials with proven track records - apatite and poly(lactic-co-glycolic acid) (PLGA) – as a drug delivery platform with promising controlled release properties. These composites have been tested in the delivery of a model protein, bovine serum albumin (BSA), as well as therapeutic proteins, recombinant human bone morphogenetic protein-2 (rhBMP-2) and rhBMP-6. The controlled release strategy is based on the use of a polymer with acidic degradation products to control the dissolution of the basic apatitic component, resulting in protein release. Therefore, any parameter that affects either polymer degradation or apatite dissolution can be used to control protein release. We have modified the protein release profile systematically by varying the polymer molecular weight, polymer hydrophobicity, apatite loading, apatite particle size, and other material and processing parameters. Biologically active rhBMP-2 was released from these composite microparticles over 100 days, in contrast to conventional collagen sponge carriers, which were depleted in approximately 2 weeks. The released rhBMP-2 was able to induce elevated alkaline phosphatase and osteocalcin expression in pluripotent murine embryonic fibroblasts. To augment tissue engineering scaffolds with tunable and sustained protein release capabilities, these composite microparticles can be dispersed in the scaffolds in different combinations to obtain a superposition of the release profiles. We have loaded rhBMP-2 into composite microparticles with a fast release profile, and rhBMP-6 into slow-releasing composite microparticles. An equi-mixture of these two sets of composite particles was then injected into a collagen sponge, allowing for dual release of the proteins from the collagenous scaffold. The ability of these BMP-loaded scaffolds to induce osteoblastic differentiation in vitro and ectopic bone formation in a rat model is being investigated. We anticipate that these apatite-polymer composite microparticles can be extended to the delivery of other signalling molecules, and can be incorporated into other types of tissue engineering scaffolds.