979 results for Markov processes
Abstract:
This work describes the state of the art in automatic dialogue strategy management using Markov decision processes (MDP) with reinforcement learning (RL). Partially observable Markov decision processes (POMDP) are also described. To test the validity of these methods, two spoken dialogue systems have been developed. The first is a spoken dialogue system for providing weather forecasts, and the second is a more complex system for train information. With the first system, comparisons between a rule-based strategy and an automatically trained strategy have been made, using a real corpus to train the automatic strategy. With the second system, the scalability of these methods to larger systems has been tested.
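As a concrete illustration of the MDP-with-RL approach described above, here is a minimal sketch of training a dialogue strategy with tabular Q-learning on a toy slot-filling task. The state space, action set, recognition probability, and reward values are all illustrative assumptions, not the design of the systems in the abstract.

```python
import random

# Toy slot-filling dialogue MDP: the state is the number of filled slots
# (0..N_SLOTS); actions are "ask" (try to fill a slot) or "confirm" (close
# the dialogue). All dynamics and rewards here are illustrative.
N_SLOTS = 3
ACTIONS = ["ask", "confirm"]
P_UNDERSTOOD = 0.8   # assumed chance the recognizer fills the slot

def step(state, action):
    """Return (next_state, reward, done) for the toy MDP."""
    if action == "ask":
        if state < N_SLOTS and random.random() < P_UNDERSTOOD:
            return state + 1, -1.0, False    # small per-turn cost
        return state, -1.0, False
    # "confirm": the dialogue succeeds only if every slot is filled
    return state, (20.0 if state == N_SLOTS else -10.0), True

def q_learning(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda a: q[(s, a)]))
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

if __name__ == "__main__":
    q = q_learning()
    for s in range(N_SLOTS + 1):
        print(s, max(ACTIONS, key=lambda a: q[(s, a)]))
```

Running the sketch, the learned greedy policy is to keep asking until all slots are filled and only then confirm, which is exactly the kind of strategy a rule-based system would encode by hand.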
Abstract:
Given a spectral density matrix or, equivalently, a real autocovariance sequence, the author seeks to determine a finite-dimensional linear time-invariant system which, when driven by white noise, will produce an output whose spectral density is approximately Φ(ω), and an approximate spectral factor of Φ(ω). The author employs the Anderson-Faurre theory in his analysis.
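In the scalar case, one standard way to obtain an approximate spectral factor from a finite autocovariance sequence is to fit an autoregressive model via the Levinson-Durbin recursion; driving the resulting all-pole filter with white noise yields an output whose spectral density approximates Φ(ω). The sketch below illustrates that idea under those assumptions; it is not the Anderson-Faurre construction used in the paper, and the model order and example covariances are illustrative.

```python
import numpy as np

def levinson_durbin(r, p):
    """Fit an AR(p) model x_t = a[0] x_{t-1} + ... + a[p-1] x_{t-p} + e_t
    to autocovariances r[0..p] (Yule-Walker equations, Levinson-Durbin)."""
    a = np.zeros(0)
    v = r[0]                                  # innovation variance so far
    for k in range(1, p + 1):
        lam = (r[k] - a @ r[k - 1:0:-1]) / v  # reflection coefficient
        a = np.concatenate([a - lam * a[::-1], [lam]])
        v *= 1.0 - lam * lam
    return a, v

# Example: exact autocovariances of x_t = 0.6 x_{t-1} + e_t, Var(e_t) = 1,
# for which r[k] = 0.6**k / (1 - 0.36); the fit recovers a ~ [0.6], v ~ 1.
r = np.array([0.6 ** k for k in range(4)]) / (1 - 0.36)
a, v = levinson_durbin(r, p=1)
print(a, v)

# Spectral density of the fitted factor, v / |1 - sum_j a_j e^{-ijw}|^2,
# approximates the target density at frequencies w in [0, pi].
w = np.linspace(0.0, np.pi, 5)
H = 1.0 - sum(a[j] * np.exp(-1j * (j + 1) * w) for j in range(len(a)))
print(v / np.abs(H) ** 2)
```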
Abstract:
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
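A minimal sketch of the belief-monitoring step with such a hybrid observation follows. The likelihood of an observation factors into a discrete recognition-result probability and a Gaussian density over the continuous confidence score; the two-goal state space, transition model, and densities below are illustrative assumptions, not the paper's testbed.

```python
import numpy as np

# Illustrative two-state dialogue POMDP: the user's goal is "A" or "B".
STATES = ["A", "B"]
# p(s'|s, a) for a fixed action a (assumed here to leave the goal unchanged)
T = np.eye(2)
# Discrete observation component: p(recognized word | true goal)
P_WORD = {"a_word": np.array([0.8, 0.2]),
          "b_word": np.array([0.2, 0.8])}

def conf_density(c, correct, sigma=0.2):
    """Continuous component: Gaussian density of confidence score c,
    centred high when the recognition is correct, low otherwise."""
    mu = 0.8 if correct else 0.3
    return np.exp(-0.5 * ((c - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def belief_update(b, word, conf):
    """b'(s') ~ p(o|s') * sum_s p(s'|s,a) b(s), with o = (word, conf)."""
    pred = T.T @ b
    correct = np.array([word == "a_word", word == "b_word"])
    like = P_WORD[word] * np.where(correct, conf_density(conf, True),
                                   conf_density(conf, False))
    b2 = like * pred
    return b2 / b2.sum()

b = np.array([0.5, 0.5])
b = belief_update(b, "a_word", conf=0.9)   # high-confidence "a" recognition
print(b)   # belief shifts strongly toward goal A
```

A handcrafted dialogue manager can then consult this belief (rather than the raw recognition result) when choosing its next action, which is the kind of improvement the abstract reports.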
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter-expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
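A hedged sketch of the overall recipe on a toy problem: the latent reward parameter of a two-state MDP is sampled from its posterior by random-walk Metropolis-Hastings, with the likelihood of observed state/action pairs given by a softmax (Boltzmann) choice model over Q-values. The MDP, the choice model, the flat prior, and the omission of the parameter-expansion step are all simplifications relative to the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state, 2-action MDP; action 1 in state 0 reaches the rewarding
# state 1 with high probability. Transition tensor P[s, a, s'].
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
GAMMA = 0.9

def q_values(theta, iters=200):
    """Value iteration for the parameterised reward r(s) = [0, theta]."""
    r = np.array([0.0, theta])
    q = np.zeros((2, 2))
    for _ in range(iters):
        v = q.max(axis=1)
        q = r[:, None] + GAMMA * P @ v
    return q

def log_likelihood(theta, data, beta=2.0):
    """Softmax (Boltzmann) choice model over Q-values."""
    q = beta * q_values(theta)
    logp = q - np.logaddexp(q[:, 0], q[:, 1])[:, None]
    return sum(logp[s, a] for s, a in data)

# Simulated demonstrations from a controller with true theta = 1.0
true_q = q_values(1.0)
data = []
for _ in range(200):
    s = rng.integers(2)
    p = np.exp(2.0 * true_q[s]); p /= p.sum()
    data.append((s, rng.choice(2, p=p)))

# Random-walk Metropolis over theta with a flat prior
theta, samples = 0.0, []
ll = log_likelihood(theta, data)
for _ in range(2000):
    prop = theta + 0.2 * rng.normal()
    ll_prop = log_likelihood(prop, data)
    if np.log(rng.random()) < ll_prop - ll:
        theta, ll = prop, ll_prop
    samples.append(theta)
print("posterior mean theta:", np.mean(samples[500:]))
```

The posterior mean recovers a value near the demonstrator's reward parameter; predictions about future actions would then average the softmax policy over the posterior samples.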
Abstract:
Attention has recently focussed on stochastic population processes that can undergo total annihilation followed by immigration into state j at rate α_j. The investigation of such models, called Markov branching processes with instantaneous immigration (MBPII), involves the study of existence and recurrence properties. However, results developed to date are generally opaque, and so the primary motivation of this paper is to construct conditions that are far easier to apply in practice. These turn out to be identical to the conditions for positive recurrence, which are very easy to check. We obtain, as a consequence, the surprising result that any MBPII that exists is ergodic, and so must possess an equilibrium distribution. These results are then extended to more general MBPII, and we show how to construct the associated equilibrium distributions.
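To make the MBPII dynamics concrete, here is a Gillespie-style simulation sketch of a binary-splitting branching process that, at the instant of total annihilation, restarts in state j with probability proportional to α_j (the limiting effect of immigration at instantaneous rates). The split/death rates and α_j values are illustrative; in the subcritical regime shown, the time-averaged occupancies approximate the equilibrium distribution whose existence the paper establishes.

```python
import random

# Each individual independently splits in two at rate LAMBDA or dies at
# rate MU (subcritical: MU > LAMBDA, so annihilation recurs). Illustrative.
LAMBDA, MU = 1.0, 1.5
ALPHA = [0.0, 0.7, 0.2, 0.1]   # on annihilation, restart in state j w.p. ~ ALPHA[j]

def simulate(t_max):
    """Return time-weighted state occupancies up to time t_max."""
    t, n = 0.0, 1
    occupancy = {}
    while t < t_max:
        if n == 0:
            # total annihilation: instantaneous immigration into state j
            n = random.choices(range(len(ALPHA)), weights=ALPHA)[0]
            continue
        rate = n * (LAMBDA + MU)
        dt = random.expovariate(rate)
        occupancy[n] = occupancy.get(n, 0.0) + min(dt, t_max - t)
        t += dt
        n += 1 if random.random() < LAMBDA / (LAMBDA + MU) else -1
    return occupancy

occ = simulate(10_000.0)
total = sum(occ.values())
# Empirical equilibrium distribution over the lowest few states
for j in sorted(occ)[:5]:
    print(j, round(occ[j] / total, 4))
```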
Abstract:
A generalized Markov branching process (GMBP) is a Markov branching model in which the infinitesimal branching rates are modified by an interaction index. It is proved that the GMBP always exists and is unique. An associated differential-integral equation is derived. The extinction probability and the mean and conditional mean extinction times are obtained. Ergodicity and stability of GMBP with resurrection are also considered. Easy checking criteria are established for ordinary and strong ergodicity. The equilibrium distribution is given in an elegant closed form. The probabilistic meaning of our results is clear and is explained.
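For the ordinary branching case, the extinction probability starting from one individual is the smallest root in [0, 1] of f(s) = s, where f is the offspring probability generating function, and from i individuals it is that root raised to the power i. A minimal numeric sketch of this classical computation follows; the offspring law is illustrative, and the paper's interaction-indexed generalization is not reproduced here.

```python
import numpy as np

# Illustrative offspring law: an individual is replaced by k individuals
# with probability b[k] (k = 1 is a no-op and is taken as 0).
b = np.array([0.3, 0.0, 0.5, 0.2])   # die, -, split in two, split in three

def extinction_probability(b, tol=1e-12):
    """Smallest fixed point in [0, 1] of the offspring p.g.f.
    f(s) = sum_k p[k] s**k, found by iterating s <- f(s) from s = 0."""
    p = b / b.sum()
    s = 0.0
    while True:
        s_new = np.polyval(p[::-1], s)   # evaluates sum_k p[k] * s**k
        if abs(s_new - s) < tol:
            return s_new
        s = s_new

q = extinction_probability(b)
print("extinction probability from one individual:", q)
print("from i = 3 individuals:", q ** 3)
```

Iterating from 0 converges monotonically to the smallest fixed point because the generating function is increasing and convex on [0, 1]; since the mean offspring number here exceeds one, the returned value is strictly below 1.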
Abstract:
This paper focuses on the basic problems regarding uniqueness and extinction properties for generalised Markov branching processes. A uniqueness criterion is first established, and a differential–integral equation satisfied by the transition functions of such processes is derived. The extinction probability is then obtained. A closed form is presented for both the mean extinction time and the conditional mean extinction time. It turns out that these important quantities are closely related to the elementary gamma function.
Abstract:
This paper investigates ergodicity and stability for generalised Markov branching processes with resurrection. Easy checking criteria, including several clear-cut corollaries, are established for ordinary and strong ergodicity of such processes. The equilibrium distribution is given in an elegant closed form for the ergodic case. The probabilistic interpretation of the results is clear and is explained.
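The paper's closed form is specific to the generalized branching structure. As a generic numeric illustration of an equilibrium distribution for an ergodic continuous-time chain with resurrection, one can truncate the state space and solve πQ = 0 with Σ_j π_j = 1. The generator below (linear birth-death rates with a resurrection jump out of state 0) and the truncation level are illustrative assumptions.

```python
import numpy as np

N = 50                  # truncation level (illustrative)
lam, mu = 1.0, 1.5      # per-individual split / death rates (subcritical)
res = 0.4               # resurrection rate out of state 0 into state 1

# Generator Q of a truncated birth-death branching chain with resurrection
Q = np.zeros((N + 1, N + 1))
Q[0, 1] = res
for i in range(1, N + 1):
    Q[i, i - 1] = i * mu
    if i < N:
        Q[i, i + 1] = i * lam
np.fill_diagonal(Q, -Q.sum(axis=1))

# Solve pi Q = 0 together with sum(pi) = 1 as a stacked linear system
A = np.vstack([Q.T, np.ones(N + 1)])
b = np.zeros(N + 2); b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print("equilibrium mass on states 0..4:", np.round(pi[:5], 4))
```

Because the chain is subcritical, virtually all equilibrium mass sits on small states, so the truncation error at N = 50 is negligible for this illustration.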