New Techniques in Deep Representation Learning


Author(s): Andrew, Galen Michael
Contributor(s)

Todorov, Emanuel

Date(s)

14/07/2016

01/06/2016

Abstract

Thesis (Ph.D.)--University of Washington, 2016-06

The choice of feature representation can have a large impact on the success of a machine learning algorithm at solving a given problem. Although human engineers employing task-specific domain knowledge still play a key role in feature engineering, automated domain-independent algorithms, in particular methods from the area of deep learning, are proving increasingly useful on a variety of difficult tasks, including speech recognition, image analysis, natural language processing, and game playing. This document describes three new techniques for automated domain-independent deep representation learning.

Sequential deep neural networks (SDNN) learn representations of data that extends continuously in time, such as audio. Unlike "sliding window" neural networks or convolutional neural networks applied to such data, SDNNs can capture temporal patterns of arbitrary span, and can encode the degree of continuity that discovered features should exhibit through time.

Deep canonical correlation analysis (DCCA) learns parametric nonlinear transformations of multiview data that capture latent shared aspects of the views, so that the learned representation of each view is maximally predictive of (and predicted by) the other. DCCA may be able to learn to represent abstract properties even when the two views are not superficially related.

The orthant-wise limited-memory quasi-Newton algorithm (OWL-QN) can be employed to train any parametric representation mapping to produce parameters that are sparse (mostly zero), resulting in more interpretable and more compact models. If the prior assumption that parameters should be sparse is reasonable for the data source, training with OWL-QN should also improve generalization.

Experiments on many different tasks demonstrate that these new methods are computationally efficient relative to existing comparable methods, and often produce representations that yield improved performance on machine learning tasks.
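To make the DCCA objective concrete, the following is a minimal NumPy sketch, not the thesis's implementation, of the quantity DCCA maximizes at the top layer of its two networks: the total canonical correlation between the two views' representations, i.e., the sum of the singular values of T = S11^{-1/2} S12 S22^{-1/2}. The ridge constants rx and ry are illustrative stabilizers, not values taken from the thesis.

    import numpy as np

    def total_correlation(H1, H2, rx=1e-4, ry=1e-4):
        """Sum of canonical correlations between two views.

        H1: (n_samples, d1) top-layer outputs of the first network.
        H2: (n_samples, d2) top-layer outputs of the second network.
        rx, ry: small ridge terms keeping the covariances invertible
        (illustrative values, not from the thesis).
        """
        n = H1.shape[0]
        H1c = H1 - H1.mean(axis=0)  # center each view
        H2c = H2 - H2.mean(axis=0)
        S11 = H1c.T @ H1c / (n - 1) + rx * np.eye(H1.shape[1])
        S22 = H2c.T @ H2c / (n - 1) + ry * np.eye(H2.shape[1])
        S12 = H1c.T @ H2c / (n - 1)

        def inv_sqrt(S):
            # S is symmetric positive definite thanks to the ridge term,
            # so its inverse square root follows from an eigendecomposition.
            w, V = np.linalg.eigh(S)
            return V @ np.diag(w ** -0.5) @ V.T

        T = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
        return np.linalg.svd(T, compute_uv=False).sum()

In DCCA this value is maximized by gradient ascent with respect to the parameters of both networks, so the two learned representations become maximally correlated.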
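The key device in OWL-QN is replacing the gradient of the L1-regularized objective with a pseudo-gradient that yields a valid descent direction at points where the L1 term is non-differentiable. Below is a minimal sketch of that step, assuming an objective of the form f(w) = loss(w) + c * ||w||_1; the function and argument names are illustrative.

    import numpy as np

    def pseudo_gradient(w, grad_loss, c):
        """Pseudo-gradient of f(w) = loss(w) + c * ||w||_1.

        Where w_i != 0, the L1 term is differentiable and contributes
        c * sign(w_i). Where w_i == 0, the subdifferential is the interval
        [grad_i - c, grad_i + c]; take the endpoint that permits descent,
        or 0 if neither endpoint does.
        """
        pg = np.where(w != 0, grad_loss + c * np.sign(w), 0.0)
        zero = (w == 0)
        right = grad_loss + c  # directional derivative into the positive orthant
        left = grad_loss - c   # directional derivative into the negative orthant
        pg = np.where(zero & (right < 0), right, pg)
        pg = np.where(zero & (left > 0), left, pg)
        return pg

In the full algorithm this pseudo-gradient feeds the usual L-BFGS direction computation, and each line-search iterate is projected back onto the orthant of the current point, so coordinates that would cross zero are clipped to zero; this is how the method produces exactly sparse parameters.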

Format

application/pdf

Identifier

Andrew_washington_0250E_15613.pdf

http://hdl.handle.net/1773/36548

Language(s)

en_US

Keywords #artificial intelligence #deep learning #machine learning #neural networks #representation learning #Computer science #Artificial intelligence #computer science and engineering
Type

Thesis