999 results for Machine costs


Relevance:

20.00%

Publisher:

Abstract:

This thesis studies three classes of randomized numerical linear algebra algorithms, namely: (i) randomized matrix sparsification algorithms, (ii) low-rank approximation algorithms that use randomized unitary transformations, and (iii) low-rank approximation algorithms for positive-semidefinite (PSD) matrices.

Randomized matrix sparsification algorithms set randomly chosen entries of the input matrix to zero. When the approximant is substituted for the original matrix in computations, its sparsity allows one to employ faster sparsity-exploiting algorithms. This thesis contributes bounds on the approximation error of nonuniform randomized sparsification schemes, measured in the spectral norm and two NP-hard norms that are of interest in computational graph theory and subset selection applications.
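
As a concrete illustration of the idea, the NumPy sketch below implements a generic nonuniform sparsification scheme in which entry (i, j) is kept with probability proportional to |A_ij| and rescaled so that the sparsified matrix is unbiased. It is a minimal sketch of the general technique; the exact sampling probabilities analyzed in the thesis may differ.

import numpy as np

def sparsify(A, s, seed=None):
    # Keep entry (i, j) with probability p_ij proportional to |A_ij| (capped at 1),
    # so that roughly s entries survive in expectation, and rescale kept entries
    # by 1/p_ij so the sparsified matrix is an unbiased estimate of A.
    rng = np.random.default_rng(seed)
    p = np.minimum(1.0, s * np.abs(A) / np.abs(A).sum())
    keep = rng.random(A.shape) < p
    S = np.zeros_like(A, dtype=float)
    S[keep] = A[keep] / p[keep]
    return S

A = np.random.default_rng(0).standard_normal((200, 200))
S = sparsify(A, s=5000, seed=1)
print(np.count_nonzero(S), np.linalg.norm(A - S, 2))   # sparsity and spectral-norm error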

Low-rank approximations based on randomized unitary transformations have several desirable properties: they have low communication costs, are amenable to parallel implementation, and exploit the existence of fast transform algorithms. This thesis investigates the tradeoff between the accuracy and cost of generating such approximations. State-of-the-art spectral and Frobenius-norm error bounds are provided.
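
To make the structure of such schemes concrete, here is a minimal sketch of a randomized range finder based on a subsampled randomized Fourier transform (random sign flips, an FFT, and a uniform column subsample), in the spirit of Halko, Martinsson, and Tropp. It is an illustrative variant only, not necessarily the exact transform or error analysis studied in the thesis.

import numpy as np

def srft_lowrank(A, k, oversample=10, seed=None):
    # Sketch Y = A D F R: random sign flips (D), an FFT along the rows (F),
    # and a uniform subsample of l = k + oversample mixed columns (R).
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(n, k + oversample)
    signs = rng.choice([-1.0, 1.0], size=n)
    cols = rng.choice(n, size=l, replace=False)
    Y = np.fft.fft(A * signs, axis=1)[:, cols]
    Q, _ = np.linalg.qr(Y)              # orthonormal basis for the sketched range
    B = Q.conj().T @ A                  # small l-by-n matrix
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U)[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(1)
A = rng.standard_normal((500, 20)) @ rng.standard_normal((20, 300))   # rank-20 test matrix
U, s, Vt = srft_lowrank(A, k=20)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))           # relative Frobenius error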

The last class of algorithms considered is that of PSD "sketching" algorithms. Such sketches can be computed faster than approximations based on projecting onto mixtures of the columns of the matrix. The performance of several such sketching schemes is empirically evaluated using a suite of canonical matrices drawn from machine learning and data analysis applications, and a framework is developed for establishing theoretical error bounds.
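
One widely used member of this family is the Nyström-type sketch built from a random test matrix. The sketch below, which uses a Gaussian test matrix S with C = A S and W = Sᵀ A S, is only meant to illustrate the shape of such schemes; the thesis evaluates and analyzes a broader family of sketching matrices.

import numpy as np

def nystrom_sketch(A, l, seed=None):
    # Nystrom-type approximation of a PSD matrix A from a Gaussian test matrix S:
    # with C = A S and W = S^T A S, the sketch is C pinv(W) C^T.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    S = rng.standard_normal((n, l))
    C = A @ S                            # n x l
    W = S.T @ C                          # l x l core matrix, equal to S^T A S
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(2)
G = rng.standard_normal((400, 30))
A = G @ G.T                              # a rank-30 PSD test matrix
A_hat = nystrom_sketch(A, l=60)
print(np.linalg.norm(A - A_hat) / np.linalg.norm(A))   # relative approximation error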

In addition to studying these algorithms, this thesis extends the Matrix Laplace Transform framework to derive Chernoff and Bernstein inequalities that apply to all the eigenvalues of certain classes of random matrices. These inequalities are used to investigate the behavior of the singular values of a matrix under random sampling, and to derive convergence rates for each individual eigenvalue of a sample covariance matrix.
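
For context, the classical matrix Bernstein inequality (in the form popularized by Tropp) controls only the extreme eigenvalue: if $X_1, \dots, X_n$ are independent, zero-mean, self-adjoint random $d \times d$ matrices with $\|X_k\| \le R$ almost surely and $\sigma^2 = \big\| \sum_k \mathbb{E}[X_k^2] \big\|$, then

\[ \Pr\Big( \lambda_{\max}\Big( \sum_k X_k \Big) \ge t \Big) \le d \, \exp\Big( \frac{-t^2/2}{\sigma^2 + R t/3} \Big). \]

The eigenvalue Chernoff and Bernstein inequalities derived in the thesis extend bounds of this type from the extreme eigenvalue to all eigenvalues of the random matrix.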

Relevance:

20.00%

Publisher:

Abstract:

30 p.

Relevance:

20.00%

Publisher:

Abstract:

35 p.

Relevance:

20.00%

Publisher:

Abstract:

The objectives of this work are to analyze and optimize the hard turning of ASP-23 steel, with particular attention to evaluating different manufacturing solutions for broaches. The project arises from the importance of reducing both the economic cost and the manufacturing time of components made from ASP-23 steel by means of hard turning, a machining process of growing importance in industries such as automotive and aeronautics. The project is the result of the need of EKIN S. Coop, one of the leaders in high-precision machine-tool processes for broaching, to develop a more efficient machining process for the broaches it produces. Accordingly, the benefits of hard turning in the machining of ASP-23 were investigated in the machine-tool laboratory (ETSIB). Nowadays, with the rapid development of new materials, manufacturing processes are becoming increasingly complex, owing to the wide variety of machines on which the processes are carried out, the variety of tool geometries and materials, the properties of the workpiece material, the wide range of cutting parameters with which the process can be configured (depth of cut, cutting speed, feed...), and the diversity of clamping elements used. We must also be aware that this variety involves large deformations, speeds, and temperatures. Herein lie the justification for and the interest of this project. For this reason, the project attempts to take a small step forward in understanding the hard turning of steels with poor machinability, while remaining aware of the breadth and difficulty of progress in manufacturing engineering and of how much work remains to be done.

Relevance:

20.00%

Publisher:

Abstract:

In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario in which the training and test distributions can differ. Contrary to conventional wisdom, we show that mismatched training and test distributions can in fact yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution, an optimal training distribution that depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, and how to approximate it in a practical scenario. The benefits of using this distribution are exemplified on both synthetic and real data sets.

In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of how weighting affects the out-of-sample error is easy to understand but is not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm, which determines, for a given set of weights, whether the out-of-sample performance will improve in a practical setting. This is necessary because the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
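
As background for how such weights enter training, the sketch below applies standard covariate-shift importance weights w(x) = p_test(x)/p_train(x) inside a weighted least-squares fit. The Gaussian train/test densities and the cubic feature map are hypothetical ingredients chosen for illustration, and the decision rule of Targeted Weighting itself is not reproduced here.

import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Density of a normal distribution, used as a stand-in for estimated densities.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def importance_weights(x, p_train, p_test):
    # Covariate-shift weights w(x) = p_test(x) / p_train(x), normalized to mean 1.
    w = p_test(x) / p_train(x)
    return w * len(w) / w.sum()

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, size=500)                  # fixed training inputs ~ N(0, 1)
y = np.sin(x) + 0.1 * rng.standard_normal(500)      # noisy targets

# Hypothetical test distribution (mean 1, standard deviation 0.5): reweight the
# fixed training sample so it behaves as if drawn from the desired distribution.
w = importance_weights(x,
                       p_train=lambda t: gaussian_pdf(t, 0.0, 1.0),
                       p_test=lambda t: gaussian_pdf(t, 1.0, 0.5))

# Weighted polynomial least squares: minimize sum_i w_i * (y_i - phi(x_i) . beta)^2.
Phi = np.vander(x, 4)                               # cubic feature map
beta = np.linalg.solve(Phi.T @ (w[:, None] * Phi), Phi.T @ (w * y))
print(beta)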

Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance on a large real dataset such as the Netflix dataset. Their favorable computational complexity is the main advantage over previous algorithms proposed in the covariate-shift literature.

In the second part of the thesis we apply machine learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows behavior in videos of animals to be analyzed with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), detects movemes, actions, and stories from time series describing the positions of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing a means to discriminate groups of animals, for example according to their genetic line.