Abstract:
In the first part of the thesis we explore three fundamental questions that arise naturally in a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that mismatched training and test distributions can in fact yield better out-of-sample performance. This optimal performance is obtained by training with the dual distribution, an optimal training distribution that depends on the test distribution set by the problem but not on the target function we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. The benefits of using this distribution are exemplified on both synthetic and real data sets.
In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the effect of weights on the out-of-sample error is easy to understand but not actionable in practice, because the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm, which determines, for a given set of weights, whether out-of-sample performance will improve in a practical setting. This is necessary because the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
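The Targeted Weighting algorithm itself is not spelled out in the abstract; as a minimal sketch of the kind of reweighting it operates on, the snippet below uses standard importance weights w(x) = q(x)/p(x) to make a sample drawn from a training density p behave, in expectation, like a sample from a target density q. The Gaussian densities and the quantity being estimated are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training inputs drawn from p = N(0, 1); target (e.g. dual/test) density q = N(1, 1).
x_train = rng.normal(0.0, 1.0, size=2000)

def log_gauss(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

# Importance weights w(x) = q(x) / p(x), normalized to sum to 1.
w = np.exp(log_gauss(x_train, 1.0, 1.0) - log_gauss(x_train, 0.0, 1.0))
w /= w.sum()

# The weighted sample mean estimates the mean under q (close to 1.0),
# even though the points were drawn from p (mean 0.0).
print(x_train.mean(), np.sum(w * x_train))
```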
Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms scale to very large datasets, and we show how they lead to improved performance on a large real dataset, the Netflix dataset. Their favorable computational complexity is the main source of their advantage over previous algorithms proposed in the covariate shift literature.
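The matching algorithms themselves are not described in the abstract; the sketch below shows one generic, scalable baseline with the same goal: for each point of an unlabeled sample from the desired distribution, select its nearest neighbour in the training set. The data dimensions and distributions are illustrative stand-ins.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)

# Labeled training inputs (from the training distribution) and
# unlabeled inputs from the desired target distribution.
x_train = rng.normal(0.0, 1.0, size=(5000, 3))
x_target = rng.normal(0.5, 1.0, size=(1000, 3))

# For each target point, keep its nearest training point; the selected
# subset of the training data then roughly follows the target distribution.
tree = cKDTree(x_train)
_, idx = tree.query(x_target, k=1)
matched_idx = np.unique(idx)          # training points selected by the match
x_matched = x_train[matched_idx]
print(len(x_matched), "of", len(x_train), "training points retained")
```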
In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows behavior to be analyzed in videos of animals with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), detects movemes, actions, and stories from time series describing the positions of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing a means to discriminate between groups of animals, for example according to their genetic line.
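CUBA's algorithms are not detailed in the abstract; as a rough illustration of unsupervised motif discovery on positional time series (not CUBA itself), the sketch below clusters short windows of a toy trajectory into candidate recurring movement patterns, a crude analogue of movemes. All signals and parameters are synthetic stand-ins.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# Toy trajectory: (x, y) position of one animal over time.
t = np.arange(3000)
traj = np.column_stack([np.cos(t / 50.0), np.sin(t / 30.0)])
traj += 0.05 * rng.normal(size=(3000, 2))

# Per-frame velocity features, cut into short overlapping windows.
vel = np.diff(traj, axis=0)
win = 20
windows = np.array([vel[i:i + win].ravel()
                    for i in range(0, len(vel) - win, win // 2)])

# Clustering the windows yields candidate recurring movement motifs ("movemes").
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(windows)
print(np.bincount(labels))
```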
Abstract:
Optical Coherence Tomography (OCT) is a popular, rapidly growing imaging technique with an increasing number of biomedical applications due to its noninvasive nature. However, there are three major challenges in understanding and improving an OCT system. (1) Obtaining an OCT image is not easy: it takes either a real medical experiment or days of computer simulation. Without much data, it is difficult to study the physical processes underlying OCT imaging of different objects, simply because there are not many imaged objects. (2) Interpretation of an OCT image is also hard. This challenge is more profound than it appears. For instance, it requires a trained expert to tell from an OCT image of human skin whether there is a lesion or not. This is expensive in its own right, but even the expert cannot be sure about the exact size of the lesion or the width of the various skin layers. The take-away message is that analyzing an OCT image even at a high level usually requires a trained expert, and pixel-level interpretation is simply unrealistic. The reason is simple: we have OCT images but not their underlying ground-truth structure, so there is nothing to learn from. (3) The imaging depth of OCT is very limited (a millimeter or less in human tissue). OCT uses infrared light for illumination to stay noninvasive, but the downside is that photons at such long wavelengths can penetrate only a limited depth into the tissue before being back-scattered. To image a particular region of a tissue, photons first need to reach that region. As a result, OCT signals from deeper regions of the tissue are both weak (since few photons reach there) and distorted (due to multiple scattering of the contributing photons). This fact alone makes OCT images very hard to interpret.
This thesis addresses the above challenges by successfully developing an advanced Monte Carlo simulation platform that is 10,000 times faster than the state-of-the-art simulator in the literature, bringing the simulation time down from 360 hours to a single minute. This powerful simulation tool not only enables us to efficiently generate as many OCT images of objects with arbitrary structure and shape as we want on a common desktop computer, but also provides the underlying ground truth of the simulated images, because we specify the object structure at the start of the simulation. This is one of the key contributions of this thesis. What allows us to build such a powerful simulation tool includes a thorough understanding of the signal formation process, a clever implementation of the importance sampling/photon splitting procedure, efficient use of a voxel-based mesh system for determining photon-mesh intersections, and parallel computation of the different A-scans that constitute a full OCT image, among other programming and mathematical techniques, which are explained in detail later in the thesis.
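The accelerated simulator is not reproduced here; for orientation, the following is a deliberately simplified Monte Carlo photon random walk through a single homogeneous slab, without importance sampling, photon splitting, voxel meshes, or parallelism, just to illustrate the kind of computation being sped up. The optical coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

mu_s, mu_a, g = 10.0, 0.1, 0.9     # scattering/absorption coeffs (1/mm), anisotropy (illustrative)
mu_t = mu_s + mu_a
n_photons, max_depth = 20000, 1.0  # 1 mm slab

backscattered = 0.0
for _ in range(n_photons):
    z, uz, weight = 0.0, 1.0, 1.0
    while 0.0 <= z <= max_depth and weight > 1e-4:
        step = -np.log(rng.random()) / mu_t          # sampled free path length
        z += uz * step
        weight *= mu_s / mu_t                        # implicit absorption (weight decay)
        # Henyey-Greenstein sample of the scattering angle, then update the
        # depth direction cosine with a uniform azimuthal angle.
        s = (1 - g * g) / (1 - g + 2 * g * rng.random())
        cos_theta = (1 + g * g - s * s) / (2 * g)
        sin_theta = np.sqrt(max(0.0, 1 - cos_theta ** 2))
        uz = uz * cos_theta + np.sqrt(max(0.0, 1 - uz * uz)) * sin_theta * np.cos(2 * np.pi * rng.random())
    if z < 0.0:                                      # photon left through the top surface
        backscattered += weight
print("back-scattered fraction:", backscattered / n_photons)
```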
Next we turn to the inverse problem: given an OCT image, predict/reconstruct its ground-truth structure at the pixel level. By solving this problem we would be able to interpret an OCT image completely and precisely without the help of a trained expert. It turns out that we can do much better. For simple structures we are able to reconstruct the ground truth of an OCT image with more than 98% accuracy, and for more complicated structures (e.g., a multi-layered brain structure) with about 93%. We achieve this through extensive use of Machine Learning. The success of the Monte Carlo simulation already puts us in a strong position by providing a great deal of data (effectively unlimited) in the form of (image, truth) pairs. Through a transformation of the high-dimensional response variable, we convert the learning task into a multi-output multi-class classification problem and a multi-output regression problem. We then build a hierarchical architecture of machine learning models (a committee of experts) and train the different parts of the architecture with specifically designed data sets. At prediction time, an unseen OCT image first goes through a classification model that determines its structure (e.g., the number and types of layers present in the image); the image is then handed to a regression model trained specifically for that structure, which predicts the length of the different layers and thereby reconstructs the ground truth of the image. We also demonstrate that ideas from Deep Learning can be used to further improve performance.
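The exact committee-of-experts architecture is specific to the thesis; the sketch below only illustrates the general two-stage idea with off-the-shelf scikit-learn models on synthetic stand-in data: a classifier first predicts the structure class, and a structure-specific regressor then predicts the layer extents.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(4)

# Synthetic stand-ins: each "scan" is a 64-sample vector, its label is a
# structure class (number of layers), and its target is the layer extents.
n, d = 2000, 64
X = rng.normal(size=(n, d))
structure = rng.integers(2, 5, size=n)            # 2, 3 or 4 layers
extents = rng.uniform(0.1, 1.0, size=(n, 4))      # up to 4 layer extents

# Stage 1: one classifier decides the structure of the image.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, structure)

# Stage 2: one regressor ("expert") per structure, trained only on matching data.
experts = {}
for s in np.unique(structure):
    mask = structure == s
    experts[s] = RandomForestRegressor(n_estimators=50, random_state=0).fit(
        X[mask], extents[mask, :s])               # predict s layer extents

# Prediction: route an unseen scan through the classifier, then its expert.
x_new = rng.normal(size=(1, d))
s_hat = clf.predict(x_new)[0]
print(s_hat, experts[s_hat].predict(x_new))
```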
It is worth pointing out that solving the inverse problem automatically improves the effective imaging depth: the lower half of an OCT image (i.e., greater depth), which previously could hardly be seen, now becomes fully resolved. Interestingly, although the OCT signals that make up the lower half of the image are weak, noisy, and uninterpretable to the human eye, they still carry enough information that a well-trained machine learning model fed with them recovers precisely the true structure of the object being imaged. This is another case in which Artificial Intelligence (AI) outperforms humans. To the best of the author's knowledge, this thesis is not only a successful attempt but also the first attempt to reconstruct an OCT image at the pixel level. Even attempting such a task would require a large number of fully annotated OCT images (hundreds or even thousands), which is clearly impossible without a powerful simulation tool like the one developed in this thesis.
Abstract:
This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language, given a large dataset of semantically annotated utterances in some other source language. The aim is to reduce the cost of porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models, both of which are trained from unaligned data to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type; however, individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.
Abstract:
When tracking resources on large-scale, congested, outdoor construction sites, the cost and time of purchasing, installing, and maintaining the position sensors needed to track thousands of materials and hundreds of pieces of equipment and personnel can be significant. To alleviate this problem, a novel vision-based tracking method that allows each sensor (camera) to monitor the positions of multiple entities simultaneously has been proposed. This paper presents the full-scale validation experiments for this method. The validation included testing the method under harsh conditions at a variety of mega-project construction sites. The procedure for collecting data from the sites, the testing procedure, the metrics, and the results are reported. The full-scale validation demonstrates that the novel vision-based tracking method provides a good solution for tracking different entities on a large, congested construction site.
Abstract:
Manual inspection is required to determine the condition of damaged buildings after an earthquake. The lack of available inspectors, combined with the large volume of inspection work, makes such inspection subjective and time-consuming. The required inspections take weeks to complete, which has adverse economic and societal impacts on the affected population. This paper proposes an automated framework for rapid post-earthquake building evaluation. Under the framework, the visible damage (cracks and buckling) inflicted on concrete columns is first detected. The damage properties are then measured in relation to the column's dimensions and orientation, so that the column's load-bearing capacity can be approximated as a damage index. The column damage index, supplemented with other building information (e.g., structural type and column arrangement), is then used to query fragility curves of similar buildings, constructed from analyses of existing and ongoing experimental data. The query estimates the probability of the building being in different damage states. The framework is expected to automate the collection of building damage data, to provide a quantitative assessment of the building's damage state, and to estimate the vulnerability of the building to collapse in the event of an aftershock. Videos and manual assessments of structures after the 2010 Haiti earthquake are used to test parts of the framework.
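The paper's fragility curves are built from experimental data; the snippet below only illustrates the querying step, assuming the common lognormal form of a fragility curve keyed to the measured damage index, with placeholder parameters that are not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

# Illustrative lognormal fragility parameters (median damage index, dispersion)
# for three damage states; these numbers are placeholders, not the paper's.
fragility = {"slight": (0.2, 0.5), "moderate": (0.5, 0.5), "severe": (0.9, 0.5)}

def damage_state_probabilities(damage_index):
    """P(damage state exceeded | measured column damage index)."""
    return {ds: norm.cdf(np.log(damage_index / theta) / beta)
            for ds, (theta, beta) in fragility.items()}

print(damage_state_probabilities(0.6))
```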
Abstract:
Existing machine vision-based 3D reconstruction software programs provide a promising low-cost, and in some cases automatic, solution for infrastructure as-built documentation. However, in several steps of the reconstruction process they rely solely on detecting and matching corner-like features across multiple views of a scene. Therefore, in infrastructure scenes that include uniform materials and poorly textured surfaces, these programs fail with high probability due to a lack of feature points. Moreover, except for a few programs that generate dense 3D models through significantly time-consuming algorithms, most of them provide only a sparse reconstruction, which does not necessarily include required points such as corners or edges; these points then have to be matched manually across different views, which can make the process considerably laborious. To address these limitations, this paper presents a video-based as-built documentation method that automatically builds detailed 3D maps of a scene by aligning edge points between video frames. Compared to corner-like features, edge points are far more plentiful, even in untextured scenes, and often carry important semantic associations. The method has been tested on poorly textured infrastructure scenes, and the results indicate that a combination of edge and corner-like features would allow a broader range of scenes to be handled.
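As a small illustration of the premise that edge points are far more plentiful than corner-like features in poorly textured scenes (not of the paper's alignment pipeline), the snippet below compares the two feature types on a synthetic low-texture image using OpenCV.

```python
import cv2
import numpy as np

# Synthetic low-texture scene: a flat gray wall with a single dark beam across it.
img = np.full((480, 640), 170, dtype=np.uint8)
cv2.rectangle(img, (0, 200), (639, 260), 90, thickness=-1)

# Corner-like features (Shi-Tomasi) vs. edge points (Canny).
corners = cv2.goodFeaturesToTrack(img, maxCorners=500, qualityLevel=0.01, minDistance=5)
edges = cv2.Canny(img, 50, 150)

n_corners = 0 if corners is None else len(corners)
print("corner features:", n_corners, "edge pixels:", int(np.count_nonzero(edges)))
```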
Abstract:
Current procedures for post-earthquake safety and structural assessment are performed manually by a skilled triage team of structural engineers/certified inspectors. These procedures, and particularly the physical measurement of the damage properties, are time-consuming and qualitative in nature. This paper proposes a novel method that automatically detects spalled regions on the surface of reinforced concrete columns and measures their properties in image data. Spalling is an accepted indicator of significant damage to structural elements during an earthquake. In this method, the region of spalling is first isolated by means of a local entropy-based thresholding algorithm. The exposure of longitudinal reinforcement (the depth of spalling into the column) and the length of spalling along the column are then measured using a novel global adaptive thresholding algorithm in conjunction with image processing methods for template matching and morphological operations. The method was tested on a database of damaged RC column images collected after the 2010 Haiti earthquake, and comparison of the results with manual measurements indicates the validity of the method.
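The paper's exact algorithms and parameters are not reproduced here; the snippet below is a minimal sketch of the local entropy-based isolation step using scikit-image, with an arbitrary stand-in image and illustrative neighbourhood sizes.

```python
from skimage import data, img_as_ubyte
from skimage.filters.rank import entropy
from skimage.filters import threshold_otsu
from skimage.morphology import disk, opening

# Stand-in grayscale image (a real application would load a column photograph).
img = img_as_ubyte(data.camera())

# Local entropy highlights rough, textured regions such as spalled concrete.
ent = entropy(img, disk(9))

# An Otsu threshold on the entropy map isolates candidate spalled regions;
# a morphological opening removes small spurious blobs.
mask = ent > threshold_otsu(ent)
mask = opening(mask, disk(5))
print("candidate spalled area fraction:", mask.mean())
```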
Abstract:
The autoignition characteristics of methanol, ethanol, and MTBE (methyl tert-butyl ether) have been investigated in a rapid compression machine at pressures in the range of 20-40 atm and temperatures of 750-1000 K. All three oxygenated fuels tested show higher autoignition temperatures than paraffins, a trend consistent with the high octane numbers of these fuels. The autoignition delay time for methanol was slightly lower than the values predicted using reported reaction mechanisms. However, the measured and predicted values for the activation energy are in very good agreement, at around 44 kcal/mol. The measured activation energy for ethanol autoignition is in good agreement with previous shock tube results (31 kcal/mol), although ignition times predicted by the shock tube correlation are a factor of three lower than the measured values. The measured activation energy for MTBE, 41.4 kcal/mol, was significantly higher than the value previously observed in shock tubes (28.1 kcal/mol). The onset of preignition, characterized by a slow energy release prior to the main ignition, was observed in some instances. Possible reasons for these occurrences are discussed. © Copyright 1993 Society of Automotive Engineers, Inc.
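The activation energies quoted above are typically extracted from an Arrhenius-type fit of the measured ignition delay times; as a reminder of that standard relation (not a correlation specific to this paper's data):

```latex
% Arrhenius-type correlation commonly used for ignition delay times:
\tau = A \exp\!\left(\frac{E_a}{R T}\right)
\quad\Longrightarrow\quad
\ln\tau = \ln A + \frac{E_a}{R}\,\frac{1}{T},
\qquad
E_a = R\,\frac{\mathrm{d}(\ln\tau)}{\mathrm{d}(1/T)} .
```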