987 results for Statistical efficiency
Abstract:
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estimate the parameters of a dialogue policy which selects the system's responses based on the inferred dialogue state. However, the inference of the dialogue state itself depends on a dialogue model which describes the expected behaviour of a user when interacting with the system. Ideally the parameters of this dialogue model should also be optimised to maximise the expected cumulative reward. This article presents two novel reinforcement algorithms for learning the parameters of a dialogue model. First, the Natural Belief Critic algorithm is designed to optimise the model parameters while the policy is kept fixed. This algorithm is suitable, for example, in systems using a handcrafted policy, perhaps prescribed by other design considerations. Second, the Natural Actor and Belief Critic algorithm jointly optimises both the model and the policy parameters. The algorithms are evaluated on a statistical dialogue system modelled as a Partially Observable Markov Decision Process in a tourist information domain. The evaluation is performed with a user simulator and with real users. The experiments indicate that model parameters estimated to maximise the expected reward function provide improved performance compared to the baseline handcrafted parameters. © 2011 Elsevier Ltd. All rights reserved.
Abstract:
Simultaneous high power (2 W), high modulation speed (1 Gb/s) and high modulation efficiency (14 W/A) operation of a two-electrode tapered laser is reported. © 2011 IEEE.
Abstract:
Recent work in the area of probabilistic user simulation for training statistical dialogue managers has investigated a new agenda-based user model and presented preliminary experiments with a handcrafted model parameter set. Training the model on dialogue data is an important next step, but non-trivial since the user agenda states are not observable in data and the space of possible states and state transitions is intractably large. This paper presents a summary-space mapping which greatly reduces the number of state transitions and introduces a tree-based method for representing the space of possible agenda state sequences. Treating the user agenda as a hidden variable, the forward/backward algorithm can then be successfully applied to iteratively estimate the model parameters on dialogue data. © 2007 Association for Computational Linguistics.
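As a rough illustration of the forward/backward step mentioned in the abstract above, the following Python sketch computes posterior state probabilities for a small, generic discrete hidden Markov model. It is not the paper's agenda-based user model or its summary-space mapping; the function name `forward_backward`, the parameter layout, and the per-step scaling are assumptions made purely for illustration.

```python
def forward_backward(obs, init, trans, emit):
    """Posterior state probabilities for a generic discrete HMM (illustrative only).

    obs   : list of observation indices
    init  : init[i] is the probability of starting in state i
    trans : trans[i][j] is the probability of moving from state i to state j
    emit  : emit[i][o] is the probability of state i emitting observation o
    Assumes the observation sequence has non-zero probability under the model.
    """
    n, T = len(init), len(obs)

    # Forward pass, rescaling each step so the values sum to 1.
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    for i in range(n):
        alpha[0][i] = init[i] * emit[i][obs[0]]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            incoming = sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
            alpha[t][j] = emit[j][obs[t]] * incoming
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n)
            ) / scale[t + 1]

    # Posterior probability of each hidden state at each time step.
    return [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]


# Toy two-state model with made-up parameters.
posteriors = forward_backward(
    obs=[0, 0, 1],
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emit=[[0.8, 0.2], [0.3, 0.7]],
)
```

In an iterative estimation scheme of the kind the abstract describes, posteriors such as these would typically feed the parameter re-estimation step; that step is omitted here.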
Abstract:
A large number of experimental studies have shown that many introns of eukaryotic genes function as regulators of transcription. However, comprehensive studies of this problem have not yet been conducted. After checking the transcription frequencies of some Saccharomyces cerevisiae (yeast) genes and their introns, a remarkable phenomenon was discovered: in general, the introns of genes with higher transcription frequencies are longer, and the introns of genes with lower transcription frequencies are shorter. This suggests that the longer introns of genes with higher transcription frequencies may contain some characteristic sequence structures, which could enhance the transcription of genes. Therefore, two sets of introns of yeast genes were chosen for further study. The transcription frequencies of the first set of genes are higher (>30), and those of the second set of genes are lower (≤10). Statistical comparison of the occurrence frequencies of oligonucleotides (mainly tetranucleotides and pentanucleotides) detects some oligonucleotides whose occurrence frequencies in the first set of introns are significantly higher than those in the second set of introns, and also significantly higher than those in the exons flanking the introns of the first set. Some of these extracted oligonucleotides are the same as the regulatory elements of transcription revealed by experimental analyses. In addition, the distributions of these extracted oligonucleotides in the two sets of introns and the exons show that the sequence structures of the first set of introns are favorable for transcription of genes.
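The frequency comparison described in the abstract above can be sketched roughly as follows in Python. This is a simplified illustration, not the authors' procedure: it flags k-mers by a plain frequency ratio rather than a statistical significance test, and the function names, the `min_ratio` threshold, the `pseudo` smoothing constant, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Relative frequency of every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def enriched_kmers(high_set, low_set, k=4, min_ratio=2.0, pseudo=1e-6):
    """k-mers whose relative frequency in high_set is at least min_ratio times
    their frequency in low_set. The ratio threshold is a stand-in for a proper
    significance test; pseudo avoids division by zero for absent k-mers."""
    high = kmer_frequencies(high_set, k)
    low = kmer_frequencies(low_set, k)
    enriched = {}
    for kmer, freq in high.items():
        ratio = freq / (low.get(kmer, 0.0) + pseudo)
        if ratio >= min_ratio:
            enriched[kmer] = ratio
    return enriched


# Toy sequences, invented for illustration only.
high_expression_introns = ["TATATATA"]
low_expression_introns = ["GCGCGCGC"]
candidates = enriched_kmers(high_expression_introns, low_expression_introns, k=4)
```

The same function could be called a second time with the flanking exons as `low_set` to mirror the second comparison mentioned in the abstract.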
Abstract:
A new integrated sequence-structure database, called IADE (Integrated ASTRAL-DSSP-EMBL), incorporating matching mRNA sequence, amino acid sequence, and protein secondary structural data, is constructed. It includes 648 protein domains. Based on the IADE database, we studied the relation between RNA stem-loop frequencies and protein secondary structure. It was found that the alpha-helices and beta-strands on proteins tend to be preferably "coded" by mRNA stem regions, while the coils on proteins tend to be preferably "coded" by mRNA loop regions. These tendencies are more obvious if we observe the structural words (SWs). An SW is defined as a four-amino-acid fragment that shows a pronounced secondary structural (alpha-helix or beta-strand) propensity. It is demonstrated that the deduced correlation between protein and mRNA structure can hardly be explained as a stochastic fluctuation effect. © 2003 Wiley Periodicals, Inc.
Abstract:
Over the last two or three years, the increasing costs of energy and worsening market conditions have focussed even greater attention within paper mills on ways to improve efficiency and reduce the energy used in paper making. Arising from a multivariable understanding of paper machine operation, Advanced Process Control (APC) technology enables paper machine behaviour to be controlled in a more coherent way, using all the variables available for control. Furthermore, with the machine under better regulation and with more variables used in control, there is the opportunity to optimise machine operation, usually providing striking multi-objective performance improvements of several kinds. Traditional three-term control technology does not offer this capability. The paper presents results from several different paper machine projects we have undertaken around the world. These projects have been aimed at improving machine stability, optimising chemicals usage and reducing energy use. On a brown paperboard machine in Australasia, APC has reduced specific steam usage by 10%, averaged across the grades; the controller has also provided a significant capacity to increase production. On a North American newsprint machine, the APC system has reduced steam usage by more than 10%, and it provides better control of colour and much improved wet end stability. The paper also outlines early results from two other performance improvement projects, each incorporating a different approach to reducing the energy used in paper making. The first of these two projects is focussed on optimising sheet drainage, aiming to present the dryer with a sheet having higher solids content than before. The second project aims to reduce specific steam usage by optimising the operation of the dryer hood.
Abstract:
An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. © 2012 IEEE.
Abstract:
Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data. © 2010 Association for Computational Linguistics.
Abstract:
This paper introduces a rule-based classification of single-word and compound verbs into a statistical machine translation approach. By substituting verb forms with the lemma of their head verb, the data sparseness problem caused by highly-inflected languages can be successfully addressed. On the other hand, the information of seen verb forms can be used to generate new translations for unseen verb forms. Translation results for an English-to-Spanish task are reported, producing a significant performance improvement.
Abstract:
In this paper, a method to incorporate linguistic information regarding single-word and compound verbs is proposed, as a first step towards an SMT model based on linguistically-classified phrases. By substituting these verb structures with the base form of the head verb, we achieve a better statistical word alignment performance, and are able to better estimate the translation model and generalize to unseen verb forms during translation. Preliminary experiments for the English-Spanish language pair are performed, and future research lines are detailed. © 2005 Association for Computational Linguistics.