4 results for FINITE SETS
at Massachusetts Institute of Technology
Abstract:
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed that explains the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence, and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.
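The mixture idea in this abstract can be illustrated with a small sketch. It assumes one possible parameterization, P(x, y) = sum_a P(a) P(x|a) P(y|a), and fits it with plain EM; the abstract's improved EM variants are not reproduced here. The function names, the toy count matrix, and the number of aspects are all hypothetical choices made for illustration.

```python
import random

def fit_aspect_model(counts, n_aspects, n_iters=50, seed=0):
    """Plain EM for the assumed model P(x, y) = sum_a P(a) P(x|a) P(y|a)."""
    rng = random.Random(seed)
    n_x, n_y = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)

    def random_dist(n):
        # Strictly positive random distribution used for initialization.
        w = [rng.random() + 0.1 for _ in range(n)]
        s = sum(w)
        return [v / s for v in w]

    p_a = [1.0 / n_aspects] * n_aspects
    p_x_a = [random_dist(n_x) for _ in range(n_aspects)]
    p_y_a = [random_dist(n_y) for _ in range(n_aspects)]

    for _ in range(n_iters):
        new_a = [0.0] * n_aspects
        new_x = [[0.0] * n_x for _ in range(n_aspects)]
        new_y = [[0.0] * n_y for _ in range(n_aspects)]
        # E-step: distribute each observed count over the aspects.
        for x in range(n_x):
            for y in range(n_y):
                if counts[x][y] == 0:
                    continue
                joint = [p_a[a] * p_x_a[a][x] * p_y_a[a][y]
                         for a in range(n_aspects)]
                norm = sum(joint)
                for a in range(n_aspects):
                    r = counts[x][y] * joint[a] / norm
                    new_a[a] += r
                    new_x[a][x] += r
                    new_y[a][y] += r
        # M-step: renormalize the expected counts.
        for a in range(n_aspects):
            if new_a[a] > 0:
                p_x_a[a] = [v / new_a[a] for v in new_x[a]]
                p_y_a[a] = [v / new_a[a] for v in new_y[a]]
            p_a[a] = new_a[a] / total
    return p_a, p_x_a, p_y_a

def pair_probability(model, x, y):
    """Model probability of the pair (x, y), observed or not."""
    p_a, p_x_a, p_y_a = model
    return sum(p_a[a] * p_x_a[a][x] * p_y_a[a][y] for a in range(len(p_a)))

# Toy co-occurrence counts (hypothetical): two blocks of related objects.
counts = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
]
model = fit_aspect_model(counts, n_aspects=2)
```

Because every pair probability is a mixture over shared aspects, the model assigns a well-defined probability to pairs that never appeared in the sample, which is the sparseness problem the abstract describes.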
Abstract:
In a recent seminal paper, Gibson and Wexler (1993) take important steps toward formalizing the notion of language learning in a (finite) space whose grammars are characterized by a finite number of parameters. They introduce the Triggering Learning Algorithm (TLA) and show that even in a finite space convergence may be a problem due to local maxima. In this paper we explicitly formalize learning in a finite parameter space as a Markov structure whose states are parameter settings. We show that this captures the dynamics of TLA completely and allows us to explicitly compute the rates of convergence for TLA and for variants of TLA, e.g., random walk. Also included in the paper are a corrected version of GW's central convergence proof, a list of "problem states" in addition to local maxima, and batch and PAC-style learning bounds for the model.
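The Markov view can be made concrete with a toy chain. The sketch below is not the paper's model: the three states and their transition probabilities are hypothetical, chosen only to show how a local maximum appears as an absorbing state other than the target, and how the probability of having converged after n steps can be computed by propagating a distribution through the transition matrix.

```python
def step(dist, transition):
    """One step of the chain: new_dist[j] = sum_i dist[i] * transition[i][j]."""
    n = len(transition)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]

def prob_in_target(transition, start, target, n_steps):
    """Probability that a learner starting in `start` is in `target` after n_steps."""
    dist = [0.0] * len(transition)
    dist[start] = 1.0
    for _ in range(n_steps):
        dist = step(dist, transition)
    return dist[target]

# Hypothetical chain over three parameter settings:
#   state 0: local maximum (absorbing, not the target grammar)
#   state 1: transient setting
#   state 2: target grammar (absorbing)
T = [
    [1.0, 0.0, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.0, 1.0],
]

after_one = prob_in_target(T, start=1, target=2, n_steps=1)
after_many = prob_in_target(T, start=1, target=2, n_steps=200)
```

In this toy chain a learner starting in state 1 reaches the target with probability 0.3 after one step, and the probability approaches 0.3 / (0.1 + 0.3) = 0.75 as the number of steps grows. The remaining 0.25 is trapped in the local maximum, so convergence to the target is not guaranteed.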
Abstract:
In this thesis, two different sets of experiments are described. The first is an exploration of the microscopic superfluidity of dilute gaseous Bose-Einstein condensates. The second set of experiments was performed using transported condensates in a new BEC apparatus. Superfluidity was probed by moving impurities through a trapped condensate. The impurities were created using an optical Raman transition, which transferred a small fraction of the atoms into an untrapped hyperfine state. A dramatic reduction in the collisions between the moving impurities and the condensate was observed when the velocity of the impurities was close to the speed of sound of the condensate. This reduction was attributed to the superfluid properties of a BEC. In addition, we observed an increase in the collisional density as the number of impurity atoms increased. This enhancement is an indication of bosonic stimulation by the occupied final states. This stimulation was observed both at small and large velocities relative to the speed of sound. A theoretical calculation of the effect of finite temperature indicated that the collision rate should be enhanced at small velocities due to thermal excitations. However, in the current experiments we were insensitive to this effect. Finally, the factor of two between the collision rates of indistinguishable and distinguishable atoms was confirmed. A new BEC apparatus that can transport condensates using optical tweezers was constructed. Condensates containing 10-15 million sodium atoms were produced in 20 s using conventional BEC production techniques. These condensates were then transferred into an optical trap that was translated from the ‘production chamber’ into a separate vacuum chamber: the ‘science chamber’. Typically, we transferred 2-3 million condensed atoms in less than 2 s.
This transport technique avoids optical and mechanical constraints of conventional condensate experiments and allows for the possibility of novel experiments. In the first experiments using transported BEC, we loaded condensed atoms from the optical tweezers into both macroscopic and miniaturized magnetic traps. Using microfabricated wires on a silicon chip, we observed excitation-less propagation of a BEC in a magnetic waveguide. The condensates fragmented when brought very close to the wire surface, indicating that imperfections in the fabrication process might limit future experiments. Finally, we generated a continuous BEC source by periodically replenishing a condensate held in an optical reservoir trap using fresh condensates delivered using optical tweezers. More than a million condensed atoms were always present in the continuous source, raising the possibility of realizing a truly continuous atom laser.
Abstract:
Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and, unlike automatic systems, are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000s) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces.
Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.
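The remark that a three-dimensional position follows from an established correspondence by geometry alone can be sketched with two-view midpoint triangulation. This is a generic textbook construction, not the thesis's method: each calibrated camera contributes a ray from its center through the matched image point, and the estimate is the midpoint of the shortest segment joining the two rays. The camera centers, ray directions, and function names below are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points on rays c1 + s*d1 and c2 + t*d2.

    c1, c2 are camera centers and d1, d2 are ray directions (3-element
    sequences). The rays must not be parallel.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; baseline too short")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p + s * v for p, v in zip(c1, d1)]  # closest point on ray 1
    q2 = [p + t * v for p, v in zip(c2, d2)]  # closest point on ray 2
    return [(u + v) / 2.0 for u, v in zip(q1, q2)]

# Hypothetical setup: two cameras 4 units apart viewing the point (1, 2, 5).
camera_1 = [0.0, 0.0, 0.0]
camera_2 = [4.0, 0.0, 0.0]
ray_1 = [1.0, 2.0, 5.0]    # direction from camera_1 toward the point
ray_2 = [-3.0, 2.0, 5.0]   # direction from camera_2 toward the point
point = triangulate_midpoint(camera_1, ray_1, camera_2, ray_2)
```

The denominator shrinks as the two rays become nearly parallel, which is one way to see why a longer baseline gives a better-conditioned position estimate.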