3 resultados para Practical training

em CaltechTHESIS


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Be it a physical object or a mathematical model, a nonlinear dynamical system can display complicated aperiodic behavior, or "chaos." In many cases, this chaos is associated with motion on a strange attractor in the system's phase space. And the dimension of the strange attractor indicates the effective number of degrees of freedom in the dynamical system.

In this thesis, we investigate numerical issues involved with estimating the dimension of a strange attractor from a finite time series of measurements on the dynamical system.

Of the various definitions of dimension, we argue that the correlation dimension is the most efficiently calculable and we remark further that it is the most commonly calculated. We are concerned with the practical problems that arise in attempting to compute the correlation dimension. We deal with geometrical effects (due to the inexact self-similarity of the attractor), dynamical effects (due to the nonindependence of points generated by the dynamical system that defines the attractor), and statistical effects (due to the finite number of points that sample the attractor). We propose a modification of the standard algorithm, which eliminates a specific effect due to autocorrelation, and a new implementation of the correlation algorithm, which is computationally efficient.

Finally, we apply the algorithm to chaotic data from the Caltech tokamak and the Texas tokamak (TEXT); we conclude that plasma turbulence is not a low- dimensional phenomenon.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of pseudoephedrine as a practical chiral auxiliary for asymmetric synthesis is describe. Both enantiomers of pseudoephedrine are inexpensive commodity chemicals and can be N-acylated in high yields to form tertiary amides. In the presence of lithium chloride, the enolates of the corresponding pseudoephedrine amides undergo highly diastereoselective a1kylations with a wide range of alkyl halides to afford α-substituted products in high yields. These products can then be transformed in a single operation into highly enantiomerically enriched carboxylic acids, alcohols, and aldehydes. Lithium amidotrihydroborate (LAB) is shown to be a powerful reductant for the selective reduction of tertiary amides in general and pseudoephedrine amides in particular to form primary alcohols.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that in fact mismatched training and test distribution can yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.

In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.

Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.

In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals, with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), allows detecting movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data, as well as it provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.