52 results for Markov Decision Process


Relevance: 100.00%

Publisher:

Abstract:

This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. The sampler includes a parameter expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
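
As a rough, self-contained illustration of sampling a posterior over a value function from observed state-action pairs, the sketch below runs a plain random-walk Metropolis sampler on a toy finite MDP. The transition model, the soft-greedy action likelihood, the prior and all numbers are invented for illustration; the paper's actual statistical model and its parameter-expansion step are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite MDP: S states, A actions, a known (hypothetical) transition model.
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(A, S))        # P[a, s, s'] = P(s' | s, a)

# Observed state-action pairs (standing in for data from a human controller).
data = [(0, 1), (1, 1), (2, 0), (3, 1), (1, 1), (2, 0)]

def log_target(V, beta=3.0):
    """Unnormalised log posterior over the value function V.

    Actions are assumed soft-greedy with respect to the one-step lookahead
    values implied by V (an illustrative likelihood, not the paper's model),
    and V is given a simple N(0, 1) prior.
    """
    Q = np.array([[gamma * (P[a, s] @ V) for a in range(A)] for s in range(S)])
    logp = beta * Q - np.logaddexp.reduce(beta * Q, axis=1, keepdims=True)
    return sum(logp[s, a] for s, a in data) - 0.5 * np.sum(V ** 2)

# Plain random-walk Metropolis over V (no parameter expansion here).
V, samples = np.zeros(S), []
cur = log_target(V)
for _ in range(5000):
    prop = V + 0.1 * rng.standard_normal(S)
    prop_lp = log_target(prop)
    if np.log(rng.random()) < prop_lp - cur:
        V, cur = prop, prop_lp
    samples.append(V.copy())

print("Posterior mean of V:", np.mean(samples[1000:], axis=0))  # drop burn-in
```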

Relevance: 100.00%

Publisher:

Abstract:

This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and a continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP that outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
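
A minimal sketch of the central mechanism, a POMDP belief update whose observation combines a discrete recognition hypothesis with a continuous confidence score, is given below. The state space, the Beta confidence-score model and all parameter values are assumptions for illustration, not the testbed system described above.

```python
import numpy as np
from scipy.stats import beta as beta_dist

# Toy dialogue POMDP: 3 hidden user goals; one system action is considered here.
S = 3
T = np.array([[0.90, 0.05, 0.05],       # hypothetical transition model P(s' | s, a)
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])
# Discrete observation component: P(recognised hypothesis h | true goal s').
O_discrete = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])

def conf_density(score, correct):
    """Continuous observation component: confidence scores are assumed Beta
    distributed, skewed high when the hypothesis matches the true goal."""
    return beta_dist.pdf(score, 5, 2) if correct else beta_dist.pdf(score, 2, 5)

def belief_update(b, hypothesis, score):
    """Belief update for a hybrid observation o = (hypothesis, confidence score)."""
    predicted = T.T @ b                                  # sum_s P(s' | s, a) b(s)
    likelihood = np.array([O_discrete[s, hypothesis] *
                           conf_density(score, hypothesis == s)
                           for s in range(S)])
    new_b = likelihood * predicted
    return new_b / new_b.sum()

b = np.ones(S) / S                                       # uniform initial belief
b = belief_update(b, hypothesis=1, score=0.85)           # high-confidence observation
print("Updated belief over user goals:", b)
```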

Relevance: 100.00%

Publisher:

Abstract:

We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller on the basis of state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how the latent variables of the model can be estimated and how predictions about actions can be made within a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
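
For a concrete sense of Bayesian inverse reinforcement learning on state/action data, the sketch below samples a posterior over a per-state reward with random-walk Metropolis, solving a toy MDP by value iteration inside the likelihood and then predicting an action from the posterior mean. The soft-greedy action model, the prior and the MDP are illustrative assumptions; the paper's latent-variable model and parameter-expanded sampler are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small MDP whose per-state reward r is latent; data are observed (s, a) pairs.
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(A, S))        # hypothetical P[a, s, s']
data = [(0, 0), (1, 1), (2, 1), (3, 0), (1, 1)]

def q_values(r, iters=100):
    """Value iteration for the Q-function under a candidate reward vector r."""
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = np.array([[r[s] + gamma * (P[a, s] @ V) for a in range(A)]
                      for s in range(S)])
    return Q

def log_posterior(r, beta=3.0):
    """Soft-greedy action likelihood plus an N(0, 1) prior on the rewards."""
    Q = q_values(r)
    logp = beta * Q - np.logaddexp.reduce(beta * Q, axis=1, keepdims=True)
    return sum(logp[s, a] for s, a in data) - 0.5 * np.sum(r ** 2)

# Random-walk Metropolis over the reward vector.
r, trace = np.zeros(S), []
cur = log_posterior(r)
for _ in range(1000):
    prop = r + 0.2 * rng.standard_normal(S)
    prop_lp = log_posterior(prop)
    if np.log(rng.random()) < prop_lp - cur:
        r, cur = prop, prop_lp
    trace.append(r.copy())

r_mean = np.mean(trace[200:], axis=0)
print("Posterior mean reward:", r_mean)
# Mimicry: predict the controller's action in state 0 from the posterior mean.
print("Predicted action in state 0:", int(q_values(r_mean)[0].argmax()))
```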

Relevance: 100.00%

Publisher:

Abstract:

The partially observable Markov decision process (POMDP) provides a popular framework for modelling spoken dialogue. This paper describes how the expectation propagation (EP) algorithm can be used to learn the parameters of the POMDP user model. Various special probability factors applicable to this task are presented, which allow the parameters to be learned even when the structure of the dialogue is complex. No annotations are required: neither the true dialogue state nor the true semantics of user utterances. Parameters optimised using the proposed techniques are shown to improve performance both in offline transcription experiments and in simulated dialogue management. ©2010 IEEE.
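
The paper's expectation-propagation factors are specific to its POMDP user model and are not reproduced here. As a much simpler stand-in that still shows user-model parameters being learned without annotations of the hidden dialogue state, the sketch below runs EM (Baum-Welch) on a toy hidden-Markov user model; the model structure and every number are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy user model: 2 hidden user states, 3 observable user acts.  The "true"
# parameters below are used only to simulate unannotated training dialogues.
A_true = np.array([[0.8, 0.2], [0.3, 0.7]])             # P(s' | s)
B_true = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])   # P(act | s)

def simulate(length=30):
    s, acts = 0, []
    for _ in range(length):
        acts.append(rng.choice(3, p=B_true[s]))
        s = rng.choice(2, p=A_true[s])
    return acts

dialogues = [simulate() for _ in range(20)]

# EM (Baum-Welch): the hidden user state is never annotated.
A = np.full((2, 2), 0.5)
B = np.full((2, 3), 1 / 3) + 0.01 * rng.random((2, 3))
B /= B.sum(axis=1, keepdims=True)
pi = np.array([0.5, 0.5])

for _ in range(30):
    A_num, B_num, pi_num = np.zeros_like(A), np.zeros_like(B), np.zeros(2)
    for acts in dialogues:
        n = len(acts)
        alpha, beta = np.zeros((n, 2)), np.zeros((n, 2))
        alpha[0] = pi * B[:, acts[0]]
        alpha[0] /= alpha[0].sum()
        for t in range(1, n):                        # scaled forward pass
            alpha[t] = (alpha[t - 1] @ A) * B[:, acts[t]]
            alpha[t] /= alpha[t].sum()
        beta[-1] = 1.0
        for t in range(n - 2, -1, -1):               # scaled backward pass
            beta[t] = A @ (B[:, acts[t + 1]] * beta[t + 1])
            beta[t] /= beta[t].sum()
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)    # state posteriors
        pi_num += gamma[0]
        for t in range(n - 1):                       # expected transition counts
            xi = alpha[t][:, None] * A * (B[:, acts[t + 1]] * beta[t + 1])[None, :]
            A_num += xi / xi.sum()
        for t in range(n):                           # expected emission counts
            B_num[:, acts[t]] += gamma[t]
    A = A_num / A_num.sum(axis=1, keepdims=True)
    B = B_num / B_num.sum(axis=1, keepdims=True)
    pi = pi_num / pi_num.sum()

print("Estimated transition model:\n", A.round(2))
print("Estimated user-act model:\n", B.round(2))
```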