26 results for On-Line Learning Resources


Relevance: 100.00%

Abstract:

We address the problem of face recognition by matching image sets. Each set of face images is represented by a subspace (or linear manifold), and recognition is carried out by subspace-to-subspace matching. In this paper, 1) a new discriminative method that maximises orthogonality between subspaces is proposed. The method improves the discrimination power of subspace-angle-based face recognition by maximising the angles between different classes. 2) We propose a method for on-line updating of the discriminative subspaces as a mechanism for continuously improving recognition accuracy. 3) A further enhancement, called the locally orthogonal subspace method, is presented to maximise the orthogonality between competing classes. Experiments using 700 face image sets show that the proposed method outperforms relevant prior art and effectively boosts its accuracy by on-line learning. The on-line learning method delivers the same solution as the batch computation at far lower computational cost, and the locally orthogonal method exhibits improved accuracy. We also demonstrate the merit of the proposed face recognition method on the portal scenarios of the Multiple Biometric Grand Challenge.
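To make the set-matching idea concrete, here is a minimal Python sketch of the baseline the abstract builds on (function names and the PCA dimension are illustrative assumptions, not taken from the paper): each image set is reduced to a PCA basis, and two sets are compared via their canonical correlations, i.e. the cosines of the principal angles between the subspaces. The paper's discriminative contribution then transforms these subspaces so that angles between different classes approach orthogonality.

```python
import numpy as np

def subspace_basis(images, dim=10):
    """PCA basis of an image set: the columns of U span the linear
    manifold representing the set. `dim` is an assumed choice."""
    X = images - images.mean(axis=0)              # centre the set
    U, _, _ = np.linalg.svd(X.T, full_matrices=False)
    return U[:, :dim]                             # leading `dim` directions

def canonical_correlations(U1, U2):
    """Cosines of the principal angles between span(U1) and span(U2);
    larger values indicate more similar subspaces."""
    return np.linalg.svd(U1.T @ U2, compute_uv=False)
```

A simple matcher would assign a probe set to the gallery class whose subspace yields the largest canonical correlations; the discriminative method aims to push these values towards zero for mismatched classes.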

Relevance: 100.00%

Abstract:

Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and in significant development costs for the simulator, thereby mitigating many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction, using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
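The abstract's core idea, replacing the simulator with end-of-dialogue rewards from real users, can be sketched as a training loop. The sketch below is a hypothetical illustration under assumed interfaces, not the paper's implementation; the success-bonus-minus-turn-count reward shape is a common convention in this line of work, assumed here for concreteness.

```python
def learn_from_users(choose_action, run_turn, ask_success, update_policy,
                     n_dialogues=300):
    """Illustrative on-line RL loop (all callables are hypothetical
    stand-ins): rewards come from real users at the end of each
    dialogue instead of from a user simulator."""
    for _ in range(n_dialogues):
        episode, state, done = [], None, False
        while not done:
            action = choose_action(state)   # e.g. GP-RL with risk-averse exploration
            state, done = run_turn(state, action)
            episode.append((state, action))
        # Assumed reward shape: success bonus minus turn count; in practice
        # the success signal from users is noisy, hence the robustness need.
        reward = (20.0 if ask_success() else 0.0) - len(episode)
        update_policy(episode, reward)      # episodic policy update
```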

Relevance: 100.00%

Abstract:

The optimisation of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet it is still the case that the commonly used training algorithms for SDS require a large number of dialogues, and hence most systems still rely on artificial data generated by a user simulator. Optimisation is therefore performed off-line before releasing the system to real users. Gaussian Processes (GPs) for RL have recently been applied to dialogue systems. One advantage of GPs is that they provide an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which uses this uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements in learning speed can be obtained and that rapid and safe on-line optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
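A minimal sketch of the uncertainty-driven exploration idea: the GP posterior supplies a mean and a variance for each action's value, and exploration is steered by the variance rather than by blind randomness. The scheme names and the probe rate below are assumptions for illustration, not the paper's exact schemes.

```python
import numpy as np

def select_action(mean, var, rng, scheme="sampling", probe_rate=0.1):
    """Uncertainty-aware action selection from per-action GP posterior
    moments (illustrative). mean, var: arrays holding each action's
    Q-value estimate and its variance."""
    if scheme == "sampling":
        # Thompson-style: sample one Q-value per action, act greedily on
        # the sample; uncertain actions get explored via their variance.
        return int(np.argmax(rng.normal(mean, np.sqrt(var))))
    if scheme == "active":
        # With small probability, probe the action the GP is least sure about.
        if rng.random() < probe_rate:
            return int(np.argmax(var))
    return int(np.argmax(mean))               # otherwise exploit the mean
```

For example, `select_action(q_mean, q_var, np.random.default_rng())` draws one posterior sample per action and acts greedily on it, so exploration decays naturally as the estimates become more certain.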

Relevance: 100.00%

Abstract:

A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users; however, early studies have been limited to very low-dimensional spaces, and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State (BUDS) system. This dynamic-Bayesian-network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
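To illustrate how a dynamic-Bayesian-network belief state can feed a policy over a hundred-plus features, here is a hedged sketch that flattens per-slot marginal beliefs into a fixed feature vector. This is an illustration only: BUDS's actual feature set is richer, and all names here are assumptions.

```python
import numpy as np

def belief_features(slot_beliefs):
    """Summarise a factorised belief state into a fixed-length vector
    using the top two marginal probabilities of each slot.
    slot_beliefs: dict mapping slot name -> {value: probability}."""
    feats = []
    for slot in sorted(slot_beliefs):                     # fixed slot order
        probs = sorted(slot_beliefs[slot].values(), reverse=True)
        feats.append(probs[0] if probs else 0.0)          # top hypothesis
        feats.append(probs[1] if len(probs) > 1 else 0.0) # runner-up
    return np.asarray(feats)
```

With a few dozen slots, two features per slot already yields a space of the order described in the abstract, which is why convergence and policy-model robustness become the central concerns.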