830 results for on line teacher
Abstract:
Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and in significant development costs for the simulator, thereby reducing many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction, using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
Abstract:
The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet the commonly used training algorithms for SDS still require a large number of dialogues, and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian processes (GPs) for RL have recently been applied to dialogue systems. One advantage of GPs is that they provide an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use this uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe on-line optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
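The idea of using value-function uncertainty to control exploration can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the exploration schemes, so the function name `select_action`, the three modes, and the per-action mean and variance inputs are all assumptions made for illustration. The sketch assumes that some GP model has already produced a posterior mean and variance of the Q-value for each candidate action.

```python
import math
import random


def select_action(q_mean, q_var, mode="stochastic", rng=random):
    """Return the index of the chosen action (hypothetical helper).

    q_mean, q_var: per-action posterior mean and variance of the Q-value,
    assumed to come from a GP model that is not shown here.
    mode:
      "greedy"     - pick the highest mean (no exploration)
      "variance"   - pick the most uncertain action (pure exploration)
      "stochastic" - draw one sample per action from N(mean, variance)
                     and pick the highest sample
    """
    if not q_mean or len(q_mean) != len(q_var):
        raise ValueError("q_mean and q_var must be non-empty and equal in length")
    if any(v < 0 for v in q_var):
        raise ValueError("variances must be non-negative")

    if mode == "greedy":
        scores = list(q_mean)
    elif mode == "variance":
        scores = list(q_var)
    elif mode == "stochastic":
        scores = [rng.gauss(m, math.sqrt(v)) for m, v in zip(q_mean, q_var)]
    else:
        raise ValueError(f"unknown mode: {mode}")

    # Index of the best-scoring action; ties resolve to the lowest index.
    return max(range(len(scores)), key=scores.__getitem__)
```

In the stochastic mode, actions whose estimates are still uncertain are tried more often, and the selection approaches the greedy choice as the variances shrink.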
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. Early studies, however, have been limited to very low-dimensional spaces, and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
The performance of a current sensor in power equipment may degrade under the influence of the environment. In this paper, we propose a method based on independent component analysis (ICA) for on-line verification of the phase difference of the current sensor. However, not all source components are mutually independent in our application. In order to obtain an exact result, we propose a relative likelihood index for choosing an optimal result from different runs. The index is based on maximum likelihood evaluation theory and independent subspace analysis. The feasibility of our method has been confirmed by experimental results.
Abstract:
The construction and evaluation of an on-column etched fused-silica porous junction for on-line coupling of capillary isoelectric focusing (CIEF) with capillary zone electrophoresis (CZE) are described. The two separation columns were integrated on a single piece of fused-silica capillary through the etched porous junction, approximately 4 to 5 mm long, along the capillary. The junction is easily prepared by etching a short section of the capillary wall with HF after removing the polyimide coating. The etched section becomes a porous glass membrane that allows only small ions related to the background electrolyte to pass through when high voltage is applied across the separation capillary. The primary advantages of this novel porous junction interface over previous designs (in which the interface is usually formed by fracturing the capillary and then connecting the two capillaries with a section of microdialysis hollow fiber membrane) are the absence of dead volume, simplicity, and ruggedness, which make it particularly well suited for on-line coupling in capillary electrophoresis-based multidimensional separation systems. The performance of the 2D CIEF-CZE system constructed with such an etched porous junction was evaluated by the analysis of protein mixtures.
Abstract:
A method involving self-concentration, on-column enrichment and field-amplified sample stacking for on-line concentration in capillary electrochromatography with a polymer monolithic column is presented. Since monolithic columns eliminate frit fabrication and the problems associated with frits, the experimental conditions can be adjusted more flexibly to obtain higher concentration factors than with conventional particulate packed columns. With the self-concentration effect, the detection sensitivity of benzene and hexylbenzene is improved by factors of 4 and 8, respectively. With on-column enrichment and ultralong injection, an improvement of up to 22,000-fold in the detection sensitivity of benzoin is achieved. Furthermore, a combination of the three above-mentioned methods yields up to a 24,000-fold improvement in detection sensitivity for caffeine, a charged compound. Parameters affecting the efficiency of on-line concentration are investigated systematically. In addition, equations describing the on-line concentration process are derived.
Abstract:
An on-line sample introduction technique for capillary gas chromatography (CGC) for the analysis of high-pressure gas-liquid mixtures has been designed and evaluated. A sample loop of 0.05 µL and a washing solvent loop of 0.5 µL are mounted on a 10-port switching valve, which serves as the injection valve. A capillary resistor is connected to the vent of the sample loop in order to maintain the pressure of the sample. Both the sample and the washing solvent are transferred into the split-injection port through a narrow-bore fused-silica capillary inserted into the injection liner through a septum. The volume of the liner serves as both the pressure-release damper and the evaporation chamber for the sample. On-line analysis of both reactants and products in an ethylene oligomerization reaction mixture at 5 MPa was carried out, which demonstrated the applicability of the technique. © 2004 Elsevier B.V. All rights reserved.