5 results for optimal stopping rule

in Aston University Research Archive


Relevance:

30.00%

Publisher:

Abstract:

We present a framework for calculating globally optimal parameters, within a given time frame, for on-line learning in multilayer neural networks. We demonstrate the capability of this method by computing optimal learning rates in typical learning scenarios. A similar treatment allows one to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule as well as to compare different training methods.

Relevance:

30.00%

Publisher:

Abstract:

We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This rule maximizes the total reduction in generalization error over the whole learning process. A simple example demonstrates that the locally optimal rule, which maximizes the rate of decrease in generalization error, may perform poorly in comparison.

Relevance:

30.00%

Publisher:

Abstract:

A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.

Relevance:

30.00%

Publisher:

Abstract:

We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.