851 results for On-line Analysis
Abstract:
Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and in significant development costs for the simulator, thereby undermining many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction, using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
Abstract:
The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet the commonly used training algorithms for SDS still require a large number of dialogues, and hence most systems rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian Processes (GPs) for RL have recently been applied to dialogue systems. One advantage of GPs is that they compute an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe online optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
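As a rough illustration of what "using uncertainty to control exploration" can mean, the sketch below picks an action by drawing one sample per action from a Gaussian over its estimated value and taking the largest draw. This is a generic sampling scheme, not necessarily any of the strategies compared in the paper; the function name `select_action`, the action labels, and the numbers are all hypothetical.

```python
import random


def select_action(q_mean, q_std, rng=None):
    """Sample one value per action from N(mean, std) and return the best.

    q_mean and q_std map each action to the estimated value and to the
    uncertainty (standard deviation) of that estimate. Both are assumed
    to come from some value-function model such as a GP; they are plain
    dictionaries here to keep the sketch self-contained.
    """
    if rng is None:
        rng = random.Random()
    sampled = {a: rng.gauss(q_mean[a], q_std[a]) for a in q_mean}
    return max(sampled, key=sampled.get)


# Hypothetical estimates for two dialogue actions.
means = {"confirm": 0.2, "request": 0.7}
uncertain = {"confirm": 1.0, "request": 1.0}
print(select_action(means, uncertain, random.Random(0)))
```

With large uncertainties the lower-valued action is still tried from time to time; as the uncertainties shrink, the choice approaches the greedy one.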
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
The complete cytochrome b and the control region of mtDNA (about 2070 bp in total) of 10 strains belonging to three subspecies of the common carp were sequenced. The strains comprised three wild strains (the Yangtze River wild common carp, Cyprinus carpio haematopterus; the Yuanjiang River wild common carp, Cyprinus carpio rubrofuscus; and the Volga River wild common carp, Cyprinus carpio carpio) and seven domestic strains (Xingguo red carp, Russian scattered scaled mirror carp, Qingtian carp, Japanese Koi carp, purse red carp, Big-belly carp, German mirror carp). Phylogenetic analysis indicated that the 10 strains form three distinct clades, corresponding to C. c. haematopterus, C. c. rubrofuscus and C. c. carpio respectively. Purse red carp, an endemic domestic strain in Jiangxi province of China, showed a higher evolutionary rate in comparison with the other strains of C. c. haematopterus, most probably because of intensive selection and a long history of domestication. Base variation ratios among the three subspecies varied from 0.78% (between C. c. haematopterus and C. c. rubrofuscus) to 1.47% (between C. c. carpio and C. c. rubrofuscus). The topology of the phylogenetic tree and the geographic distribution of the three subspecies closely resemble each other. The divergence time between C. c. carpio and the other two subspecies was estimated at about 0.9 Myr, and that between C. c. haematopterus and C. c. rubrofuscus at about 0.5 Myr. Based on phylogenetic analysis, C. c. rubrofuscus might have diverged from C. c. haematopterus.
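The abstract does not say how the base variation ratios were computed. One common reading is the uncorrected pairwise distance (p-distance): the fraction of aligned sites at which two sequences differ. The sketch below assumes that reading; the function name `p_distance` and the toy sequences are hypothetical, and sites with a gap or an ambiguous base are simply skipped.

```python
def p_distance(seq_a, seq_b):
    """Fraction of compared sites that differ between two aligned sequences.

    Assumes the sequences are already aligned (equal length). Sites where
    either sequence has a gap ('-') or an 'N' are excluded.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy example: one difference over ten compared sites.
print(f"{100 * p_distance('ACGTACGTAC', 'ACGTACGTAT'):.2f}%")
```

On a fragment of about 2070 bp, a ratio of 0.78% under this definition would correspond to roughly 16 differing sites.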
Abstract:
Cyprinidae is the largest fish family in the world and contains about 210 genera and 2010 species. Appropriate DNA markers must be selected for the phylogenetic analyses of Cyprinidae. In the present study, the 1st intron of the S7 ribosomal protein (r-protein) gene is used for the first time to examine the relationships among cyprinid fishes. The length of the 1st intron obtained by PCR amplification ranges from 655 to 859 bp in the 16 cyprinid species investigated, and is 602 bp in Myxocyprinus asiaticus. Of the 925 aligned nucleotide sites, 499 (54%) are parsimony informative. The results indicate that the 1st intron sequences of the S7 r-protein gene in cyprinids are rich in informative sites and vary remarkably in sequence divergence, from 2.3% between close species to 66.6% between distant species. The bootstrap values of the interior nodes in the NJ (neighbor-joining) and MP (maximum-parsimony) trees based on the present S7 r-protein gene data are higher than those based on cytochrome b and the d-loop region, respectively. Therefore, the 1st intron sequences of the S7 r-protein gene in cyprinids are sensitive enough for phylogenetic analyses, and the 1st intron is an appropriate genetic marker for the phylogenetic reconstruction of the taxa in different cyprinid subfamilies. However, whether the present S7 r-protein gene data can be applied to the phylogeny of taxa at the family level or at higher categories in Cypriniformes requires further study.
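For readers unfamiliar with the term, a site is usually called parsimony informative when at least two different bases each occur in at least two sequences of the alignment. The sketch below counts such sites under that standard definition; the function names and the toy alignment are hypothetical, and gaps or ambiguous characters are ignored when counting bases.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two bases each appear in at least two sequences."""
    counts = Counter(c.upper() for c in column)
    shared = [base for base in "ACGT" if counts[base] >= 2]
    return len(shared) >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences."""
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


# Toy alignment of four sequences; the first column is constant.
toy = ["AAGT", "AAGC", "ATCT", "ATCC"]
print(count_informative_sites(toy), "of", len(toy[0]), "sites")
```

The proportion reported in the abstract follows the same arithmetic: 499 informative sites out of 925 is about 54%.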
Abstract:
We develop a swept frequency method for measuring the frequency response of photodetectors (PDs) based on harmonic analysis. In this technique, a lightwave from a laser source is modulated by a radio-frequency (RF) signal via a Mach-Zehnder LiNbO3 modulator, and detected by a PD under test. The measured second-order harmonic of the RF signal contains information on the frequency responses and nonlinearities of the RF source, modulator, and PD. The frequency response of the PD alone is obtained by deducting the known frequency responses and nonlinearities of the RF source and modulator. Compared with the conventional swept frequency method, the measurement frequency range can be doubled using the proposed method. Experimental results show good agreement between the measured results and those obtained using other techniques.
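The deduction step can be pictured as a subtraction in decibels. The sketch below is only a schematic of that idea under strong assumptions that the abstract does not state: all contributions are expressed in dB and add linearly, the combined RF source and modulator contribution to the second harmonic is already known at each drive frequency, and the harmonic generated by a drive at frequency f is observed at 2f. The function name `pd_response_db` and every number are hypothetical.

```python
def pd_response_db(rf_freqs_ghz, harmonic_db, known_db):
    """Estimate a relative PD response from second-harmonic measurements.

    rf_freqs_ghz: RF drive frequencies (GHz).
    harmonic_db:  measured second-harmonic power at each drive frequency (dB).
    known_db:     assumed known source-plus-modulator contribution (dB).
    Returns (frequency, response) pairs at twice the drive frequency,
    normalised so that the first point is 0 dB.
    """
    if not (len(rf_freqs_ghz) == len(harmonic_db) == len(known_db)):
        raise ValueError("all inputs must have the same length")
    raw = [m - k for m, k in zip(harmonic_db, known_db)]
    reference = raw[0]
    return [(2 * f, r - reference) for f, r in zip(rf_freqs_ghz, raw)]


# Hypothetical sweep: drive at 1 and 2 GHz, PD characterised at 2 and 4 GHz.
print(pd_response_db([1.0, 2.0], [-30.0, -34.0], [-20.0, -21.0]))
```

The doubling of the frequency axis in the returned pairs mirrors the abstract's claim that the measurable range is twice that of the drive sweep.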