38 results for on-line condition monitoring


Relevance:

100.00%

Publisher:

Abstract:

Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and in significant development costs for the simulator, thereby undermining many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.

Relevance:

100.00%

Publisher:

Abstract:

The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet the commonly used training algorithms for SDS still require a large number of dialogues, and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian Processes (GPs) for RL have recently been applied to dialogue systems. One advantage of GPs is that they provide an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use this uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe on-line optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
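As a rough illustration of how an uncertainty estimate can steer exploration, the sketch below samples one value per action from a Gaussian around its estimated mean and picks the best sample, so that uncertain actions are tried more often. This is a generic sampling scheme, not necessarily any of the strategies compared in the paper; the action names, means and standard deviations are hypothetical.

```python
import random

def pick_action(q_mean, q_std, rng):
    """Sample one value per action from N(mean, std^2) and return the argmax.

    q_mean and q_std are hypothetical per-action value estimates and their
    uncertainties, standing in for the output of a GP value function model.
    """
    samples = {a: rng.gauss(q_mean[a], q_std[a]) for a in q_mean}
    return max(samples, key=samples.get)

# Hypothetical estimates: "request" has a lower mean but is much less certain.
q_mean = {"confirm": 0.6, "request": 0.5, "inform": 0.2}
q_std = {"confirm": 0.05, "request": 0.30, "inform": 0.05}

rng = random.Random(0)
counts = {a: 0 for a in q_mean}
for _ in range(1000):
    counts[pick_action(q_mean, q_std, rng)] += 1

print(counts)  # the uncertain "request" action is still chosen a fair share of the time
```

As the standard deviations shrink, the sampled values approach the means and the choice becomes greedy, which is one simple way exploration can fade out as learning proceeds.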

Relevance:

100.00%

Publisher:

Abstract:

A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.

Relevance:

100.00%

Publisher:

Abstract:

The fastest ever 11.25 Gb/s real-time FPGA-based optical orthogonal frequency division multiplexing (OOFDM) transceivers utilizing 64-QAM encoding/decoding and significantly improved variable power loading are experimentally demonstrated, for the first time, incorporating advanced functionalities of on-line performance monitoring, live system parameter optimization and channel estimation. Real-time end-to-end transmission of an 11.25 Gb/s 64-QAM-encoded OOFDM signal with a high electrical spectral efficiency of 5.625 bit/s/Hz over 25 km of standard and MetroCor single-mode fibres is successfully achieved with respective power penalties of 0.3 dB and -0.2 dB at a BER of 1.0 × 10^-3 in a directly modulated DFB laser-based intensity modulation and direct detection system without in-line optical amplification and chromatic dispersion compensation. The impacts of variable power loading as well as electrical and optical components on the transmission performance of the demonstrated transceivers are experimentally explored in detail. In addition, numerical simulations also show that variable power loading is an extremely effective means of escalating system performance to its maximum potential.
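The quoted bit rate and spectral efficiency can be cross-checked with simple arithmetic. The sketch below assumes that electrical spectral efficiency is defined as net bit rate divided by occupied electrical bandwidth, a definition the abstract does not state; the variable names are illustrative.

```python
import math

bit_rate_bps = 11.25e9        # 11.25 Gb/s, from the abstract
spectral_efficiency = 5.625   # bit/s/Hz, from the abstract

# Assumed definition: spectral efficiency = bit rate / electrical bandwidth.
bandwidth_hz = bit_rate_bps / spectral_efficiency
print(bandwidth_hz / 1e9)     # 2.0 (GHz)

# A 64-point constellation carries log2(64) bits per symbol.
bits_per_symbol = math.log2(64)
print(bits_per_symbol)        # 6.0
```

Under that assumed definition the figures imply an electrical bandwidth of 2 GHz; the gap between 6 bits per symbol and 5.625 bit/s/Hz would then reflect overheads the abstract does not itemise.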

Relevance:

100.00%

Publisher:

Abstract:

Cascaded 4×4 SOA switches with on-chip power monitoring exhibit potential for low-power 16×16 integrated switches. Cascaded operation at 10 Gbit/s with an IPDR of 8.5 dB and 79% lower power consumption than equivalent all-active switches is reported. © 2013 OSA.