5 resultados para locally weighted learning
em Aston University Research Archive
Resumo:
Neural networks are usually curved statistical models. They do not have finite dimensional sufficient statistics, so on-line learning on the model itself inevitably loses information. In this paper we propose a new scheme for training curved models, inspired by the ideas of ancillary statistics and adaptive critics. At each point estimate an auxiliary flat model (exponential family) is built to locally accommodate both the usual statistic (tangent to the model) and an ancillary statistic (normal to the model). The auxiliary model plays a role in determining credit assignment analogous to that played by an adaptive critic in solving temporal problems. The method is illustrated with the Cauchy model and the algorithm is proved to be asymptotically efficient.
Resumo:
We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This rule maximizes the total reduction in generalization error over the whole learning process. A simple example demonstrates that the locally optimal rule, which maximizes the rate of decrease in generalization error, may perform poorly in comparison.
Resumo:
We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.
Resumo:
In this paper, we develop a new entropic matching kernel for weighted graphs by aligning depth-based representations. We demonstrate that this kernel can be seen as an aligned subtree kernel that incorporates explicit subtree correspondences, and thus addresses the drawback of neglecting the relative locations between substructures that arises in the R-convolution kernels. Experiments on standard datasets demonstrate that our kernel can easily outperform state-of-the-art graph kernels in terms of classification accuracy.
Resumo:
The focus of this thesis is the extension of topographic visualisation mappings to allow for the incorporation of uncertainty. Few visualisation algorithms in the literature are capable of mapping uncertain data with fewer able to represent observation uncertainties in visualisations. As such, modifications are made to NeuroScale, Locally Linear Embedding, Isomap and Laplacian Eigenmaps to incorporate uncertainty in the observation and visualisation spaces. The proposed mappings are then called Normally-distributed NeuroScale (N-NS), T-distributed NeuroScale (T-NS), Probabilistic LLE (PLLE), Probabilistic Isomap (PIso) and Probabilistic Weighted Neighbourhood Mapping (PWNM). These algorithms generate a probabilistic visualisation space with each latent visualised point transformed to a multivariate Gaussian or T-distribution, using a feed-forward RBF network. Two types of uncertainty are then characterised dependent on the data and mapping procedure. Data dependent uncertainty is the inherent observation uncertainty. Whereas, mapping uncertainty is defined by the Fisher Information of a visualised distribution. This indicates how well the data has been interpolated, offering a level of ‘surprise’ for each observation. These new probabilistic mappings are tested on three datasets of vectorial observations and three datasets of real world time series observations for anomaly detection. In order to visualise the time series data, a method for analysing observed signals and noise distributions, Residual Modelling, is introduced. The performance of the new algorithms on the tested datasets is compared qualitatively with the latent space generated by the Gaussian Process Latent Variable Model (GPLVM). A quantitative comparison using existing evaluation measures from the literature allows performance of each mapping function to be compared. Finally, the mapping uncertainty measure is combined with NeuroScale to build a deep learning classifier, the Cascading RBF. This new structure is tested on the MNist dataset achieving world record performance whilst avoiding the flaws seen in other Deep Learning Machines.