132 results for hidden Markov Chain
Abstract:
This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
Abstract:
Many problems in control and signal processing can be formulated as sequential decision problems for general state space models. However, except for some simple models, one cannot obtain analytical solutions and has to resort to approximation. In this thesis, we investigate problems where Sequential Monte Carlo (SMC) methods can be combined with a gradient-based search to provide solutions to online optimisation problems. We summarise the main contributions of the thesis as follows. Chapter 4 focuses on solving the sensor scheduling problem when cast as a controlled Hidden Markov Model. We consider the case in which the state, observation and action spaces are continuous. This general case is important as it is the natural framework for many applications. In sensor scheduling, our aim is to minimise the variance of the estimation error of the hidden state with respect to the action sequence. We present a novel SMC method that uses a stochastic gradient algorithm to find optimal actions. This is in contrast to existing works in the literature that only solve approximations to the original problem. In Chapter 5 we present how SMC can be used to solve a risk-sensitive control problem. We adopt the Feynman-Kac representation of a controlled Markov chain flow and exploit the properties of the logarithmic Lyapunov exponent, which leads to a policy gradient solution for the parameterised problem. The resulting SMC algorithm follows a structure similar to the Recursive Maximum Likelihood (RML) algorithm for online parameter estimation. In Chapters 6, 7 and 8, dynamic graphical models are combined with state space models for the purpose of online decentralised inference. We concentrate on the distributed parameter estimation problem using two Maximum Likelihood techniques, namely Recursive Maximum Likelihood (RML) and Expectation Maximization (EM). The resulting algorithms can be interpreted as an extension of the Belief Propagation (BP) algorithm to compute likelihood gradients. In order to design an SMC algorithm, Chapter 8 uses a nonparametric approximation of Belief Propagation. The algorithms were successfully applied to solve the sensor localisation problem for sensor networks of small and medium size.
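A minimal sketch of the Chapter 4 idea, under strong simplifying assumptions that are illustrative rather than taken from the thesis: a scalar linear-Gaussian state, a sensor whose observation noise grows with the distance between its pointing action and the state, a bootstrap particle filter estimating the posterior variance (the scheduling cost), and an SPSA-style finite-difference stochastic gradient on a single fixed action standing in for the thesis's gradient search.

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_variance(a, T=50, N=500):
    """Bootstrap particle filter for x_t = 0.9 x_{t-1} + w_t with an
    action-dependent observation noise; returns the average posterior
    variance of the state estimate (the sensor scheduling cost)."""
    x = 0.0
    particles = rng.normal(0.0, 1.0, N)
    total_var = 0.0
    for _ in range(T):
        x = 0.9 * x + rng.normal(0.0, 0.5)
        y = x + rng.normal(0.0, 0.1 + 0.5 * abs(x - a))
        particles = 0.9 * particles + rng.normal(0.0, 0.5, N)
        part_std = 0.1 + 0.5 * np.abs(particles - a)
        logw = -0.5 * ((y - particles) / part_std) ** 2 - np.log(part_std)
        w = np.exp(logw - logw.max()); w /= w.sum()
        total_var += np.sum(w * particles**2) - np.sum(w * particles)**2
        particles = particles[rng.choice(N, N, p=w)]   # resample
    return total_var / T

# SPSA-style stochastic gradient descent on the action.
a, step, delta = 2.0, 0.5, 0.2
for k in range(100):
    d = rng.choice([-1.0, 1.0])
    g = (pf_variance(a + delta * d) - pf_variance(a - delta * d)) / (2 * delta * d)
    a -= step / (1 + k) ** 0.6 * g
print("scheduled action:", a)
```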
Abstract:
Deep belief networks are a powerful way to model complex probability distributions. However, learning the structure of a belief network, particularly one with hidden units, is difficult. The Indian buffet process has been used as a nonparametric Bayesian prior on the directed structure of a belief network with a single infinitely wide hidden layer. In this paper, we introduce the cascading Indian buffet process (CIBP), which provides a nonparametric prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network so each unit can additionally vary its behavior between discrete and continuous representations. We provide Markov chain Monte Carlo algorithms for inference in these belief networks and explore the structures learned on several image data sets.
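A generative sketch of the cascading construction, assuming the standard Indian buffet process customer scheme: the units of each layer act as customers whose sampled dishes become the units of the next, deeper layer, and the cascade terminates once a layer receives no units. The parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def ibp(num_customers, alpha):
    """Sample a binary matrix from the Indian buffet process: customer i
    takes existing dish k with probability m_k / i, then samples
    Poisson(alpha / i) brand-new dishes."""
    rows, counts = [], []
    for i in range(1, num_customers + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        for k, z in enumerate(row):
            counts[k] += z
        new = rng.poisson(alpha / i)
        row += [1] * new
        counts += [1] * new
        rows.append(row)
    Z = np.zeros((num_customers, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

# Cascade: units in layer l are customers; their dishes are layer l+1's units.
alpha, width = 1.0, 8          # 8 observed units; alpha is illustrative
layers = []
while width > 0 and len(layers) < 20:   # depth cap only for the demo
    Z = ibp(width, alpha)               # connections from layer l to layer l+1
    layers.append(Z)
    width = Z.shape[1]                  # width 0 terminates the cascade
print("depth:", len(layers), "widths:", [Z.shape[1] for Z in layers])
```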
Abstract:
In this paper, we consider Bayesian interpolation and parameter estimation in a dynamic sinusoidal model. This model is more flexible than the static sinusoidal model since it enables the amplitudes and phases of the sinusoids to be time-varying. For the dynamic sinusoidal model, we derive a Bayesian inference scheme for the missing observations, hidden states and model parameters of the dynamic model. The inference scheme is based on a Markov chain Monte Carlo method known as the Gibbs sampler. We illustrate the performance of the inference scheme in an application to packet-loss concealment of lost audio and speech packets.
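A toy version of such a scheme, assuming a static sinusoid with known frequency for brevity (the paper's dynamic model additionally gives the amplitudes their own state equation): the Gibbs sampler cycles through conjugate updates for the amplitudes and the noise variance, and imputes the samples of a lost packet from the model's predictive distribution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: one sinusoid with known frequency; one "packet" of samples lost.
T, w, sigma_true = 200, 0.3, 0.2
t = np.arange(T)
X = np.column_stack([np.cos(w * t), np.sin(w * t)])   # regressors
y = X @ np.array([1.0, -0.5]) + rng.normal(0, sigma_true, T)
miss = (t >= 80) & (t < 100)

# Gibbs sweep over amplitudes, noise variance, and missing samples.
sig2, tau2 = 1.0, 10.0                                # tau2: prior amplitude variance
y_fill = np.where(miss, 0.0, y)
for it in range(2000):
    # amplitudes | rest : conjugate Gaussian update
    P = X.T @ X / sig2 + np.eye(2) / tau2
    m = np.linalg.solve(P, X.T @ y_fill / sig2)
    amp = m + np.linalg.cholesky(np.linalg.inv(P)) @ rng.normal(size=2)
    # noise variance | rest : inverse-gamma update
    r = y_fill - X @ amp
    sig2 = 1.0 / rng.gamma(0.5 * T + 1.0, 1.0 / (0.5 * r @ r + 1e-3))
    # missing samples | rest : draw from the predictive (the concealment step)
    y_fill[miss] = X[miss] @ amp + rng.normal(0, np.sqrt(sig2), miss.sum())

print("amplitudes:", amp, "noise std:", np.sqrt(sig2))
```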
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
Abstract:
We present the Gaussian process density sampler (GPDS), an exchangeable generative model for use in nonparametric Bayesian density estimation. Samples drawn from the GPDS are consistent with exact, independent samples from a distribution defined by a density that is a transformation of a function drawn from a Gaussian process prior. Our formulation allows us to infer an unknown density from data using Markov chain Monte Carlo, which gives samples from the posterior distribution over density functions and from the predictive distribution on data space. We describe two such MCMC methods. Both methods also allow inference of the hyperparameters of the Gaussian process.
Abstract:
Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.
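The properties listed match elliptical slice sampling; a minimal sketch of one such transition (illustrative, not necessarily the paper's exact pseudocode), for a zero-mean Gaussian prior given by a Cholesky factor and an arbitrary log-likelihood. Note there are no step-size parameters: the angular bracket shrinks until a point on the ellipse is accepted.

```python
import numpy as np

rng = np.random.default_rng(3)

def elliptical_slice(f, chol_sigma, log_lik):
    """One slice-sampling transition on the ellipse through the current
    state f and a fresh prior draw; targets N(0, Sigma) times exp(log_lik)."""
    nu = chol_sigma @ rng.normal(size=f.shape)         # prior draw defining the ellipse
    log_y = log_lik(f) + np.log(rng.random())          # slice threshold
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = theta - 2.0 * np.pi, theta
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
        if log_lik(f_new) > log_y:
            return f_new
        if theta < 0.0:                                 # shrink bracket and retry
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# Example: Gaussian process prior, Gaussian likelihood around noisy data.
n = 50
x = np.linspace(0, 1, n)
K = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 0.1**2) + 1e-6 * np.eye(n)
L = np.linalg.cholesky(K)
data = np.sin(6 * x) + rng.normal(0, 0.3, n)
ll = lambda f: -0.5 * np.sum((data - f)**2) / 0.3**2

f = np.zeros(n)
for _ in range(1000):
    f = elliptical_slice(f, L, ll)
```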
Abstract:
In this paper, we describe models and algorithms for detection and tracking of group and individual targets. We develop two novel group dynamical models, within a continuous time setting, that aim to mimic behavioural properties of groups. We also describe two possible ways of modeling interactions between closely spaced targets, using a Markov Random Field (MRF) and repulsive forces. These can be combined together with a group structure transition model to create realistic evolving group models. We use a Markov Chain Monte Carlo (MCMC)-particles algorithm to perform sequential inference. Computer simulations demonstrate the ability of the algorithm to detect and track targets within groups, as well as infer the correct group structure over time.
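A sketch of one way such a repulsive interaction can enter the model, assuming a Gaussian-kernel pairwise potential (the functional form here is illustrative, not taken from the paper): the potential multiplies the dynamics prior, so MCMC or particle moves that bring targets too close together are down-weighted.

```python
import numpy as np

def repulsion_log_potential(pos, lam=2.0, scale=1.0):
    """MRF-style pairwise repulsion over target positions (K x dim).
    Pairs closer than about `scale` are penalised, discouraging
    coalescing tracks."""
    d2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)                 # ignore self-pairs
    return -0.5 * lam * np.sum(np.exp(-d2 / scale ** 2))

# In an MCMC move this term is added to the dynamics log-density, so a
# proposed configuration is accepted with the usual Metropolis ratio.
rng = np.random.default_rng(7)
pos = rng.normal(0, 3, (3, 2))                   # three targets in 2-D
prop = pos + rng.normal(0, 0.5, pos.shape)       # random-walk proposal
log_ratio = repulsion_log_potential(prop) - repulsion_log_potential(pos)
accept = np.log(rng.random()) < log_ratio        # dynamics terms omitted here
```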
Abstract:
We consider the problem of blind multiuser detection. We adopt a Bayesian approach where unknown parameters are considered random and integrated out. Computing the maximum a posteriori estimate of the input data sequence requires solving a combinatorial optimization problem. We propose here to apply the Cross-Entropy method recently introduced by Rubinstein. The performance of the Cross-Entropy method is compared to Markov chain Monte Carlo: for similar Bit Error Rate performance, we demonstrate that it outperforms a generic Markov chain Monte Carlo method in terms of operation time.
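A minimal sketch of the Cross-Entropy method on a stand-in combinatorial problem (recovering a hidden binary word; in the paper the score would instead be the posterior of the input data sequence): sample candidates from a product-Bernoulli distribution, keep an elite fraction, and move the sampling parameters toward the elite mean.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for MAP sequence detection: maximize agreement with a hidden word.
n_bits = 30
hidden = rng.integers(0, 2, n_bits)
score = lambda s: np.sum(s == hidden)           # stand-in for the posterior

p = np.full(n_bits, 0.5)                         # Bernoulli sampling parameters
N, elite_frac, smooth = 200, 0.1, 0.7
for it in range(50):
    samples = (rng.random((N, n_bits)) < p).astype(int)
    scores = np.array([score(s) for s in samples])
    elite = samples[np.argsort(scores)[-int(elite_frac * N):]]
    p = smooth * elite.mean(axis=0) + (1 - smooth) * p   # CE update
    if scores.max() == n_bits:
        break
print("recovered:", np.array_equal((p > 0.5).astype(int), hidden))
```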
Abstract:
We use reversible jump Markov chain Monte Carlo (MCMC) methods to address the problem of model order uncertainty in autoregressive (AR) time series within a Bayesian framework. Efficient model jumping is achieved by proposing model space moves from the full conditional density for the AR parameters, which is obtained analytically. This is compared with an alternative method, for which the moves are cheaper to compute, in which proposals are made only for new parameters in each move. Results are presented for both synthetic and audio time series.
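A sketch of the key ingredient, assuming a Gaussian prior on the AR coefficients (priors and parameter values here are illustrative): the full conditional of the coefficients given the order, data and noise variance is Gaussian and available in closed form, so a reversible jump move to a candidate order can propose directly from it.

```python
import numpy as np

rng = np.random.default_rng(5)

def ar_full_conditional(y, k, sig2, delta2=10.0):
    """Gaussian full conditional of the AR(k) coefficients given the data
    and noise variance, under an N(0, delta2 I) prior. Returns (mean, cov);
    a reversible jump move to order k can propose a draw from this density."""
    X = np.column_stack([y[k - j - 1 : len(y) - j - 1] for j in range(k)])
    yk = y[k:]
    cov = np.linalg.inv(X.T @ X / sig2 + np.eye(k) / delta2)
    mean = cov @ (X.T @ yk) / sig2
    return mean, cov

# Synthetic AR(2) data, then a proposal draw for a candidate order k' = 3.
T = 500
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.75 * y[t - 1] - 0.5 * y[t - 2] + rng.normal(0, 0.1)
mean, cov = ar_full_conditional(y, k=3, sig2=0.01)
proposal = rng.multivariate_normal(mean, cov)
print("proposed AR(3) coefficients:", proposal.round(3))
```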
Abstract:
We present methods for fixed-lag smoothing using Sequential Importance Sampling (SIS) on a discrete non-linear, non-Gaussian state space system with unknown parameters. Our particular application is in the field of digital communication systems. Each input data point is taken from a finite set of symbols. We represent transmission media as a fixed filter with a finite impulse response (FIR), hence a discrete state-space system is formed. Conventional Markov chain Monte Carlo (MCMC) techniques such as the Gibbs sampler are unsuitable for this task because they can only process data in batch; since data arrives sequentially, it is sensible to process it sequentially as well. In addition, many communication systems are interactive, so there is a maximum level of latency that can be tolerated before a symbol is decoded. We demonstrate this method by simulation and compare its performance to existing techniques.
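A toy sketch of the setting, assuming BPSK symbols sent through a known two-tap FIR channel (all parameters illustrative): particles carry whole symbol trajectories, weights come from the channel likelihood, and the symbol at time t - L is decided from the resampled trajectories, giving a fixed-lag smoothed estimate within a bounded decoding latency.

```python
import numpy as np

rng = np.random.default_rng(6)

# BPSK symbols through a known 2-tap FIR channel with additive noise.
h, sigma, T, lag, N = np.array([1.0, 0.5]), 0.3, 200, 5, 500
symbols = rng.choice([-1, 1], T)
y = np.convolve(symbols, h)[:T] + rng.normal(0, sigma, T)

paths = np.zeros((N, T), dtype=int)      # particle symbol histories
logw = np.zeros(N)
decoded = np.zeros(T, dtype=int)
prev = np.zeros(N)
for t in range(T):
    s_new = rng.choice([-1, 1], N)       # propose from the symbol prior
    pred = h[0] * s_new + h[1] * prev
    logw += -0.5 * ((y[t] - pred) / sigma) ** 2
    paths[:, t] = s_new
    w = np.exp(logw - logw.max()); w /= w.sum()
    idx = rng.choice(N, N, p=w)          # resample whole trajectories
    paths = paths[idx]; prev = paths[:, t].astype(float); logw[:] = 0.0
    if t >= lag:                          # fixed-lag smoothed decision
        decoded[t - lag] = 1 if paths[:, t - lag].mean() > 0 else -1
print("bit error rate:", np.mean(decoded[:T - lag] != symbols[:T - lag]))
```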
Abstract:
In this paper we address the problem of the separation and recovery of convolutively mixed autoregressive processes in a Bayesian framework. Solving this problem requires the ability to solve integration and/or optimization problems of complicated posterior distributions. We thus propose efficient stochastic algorithms based on Markov chain Monte Carlo (MCMC) methods. We present three algorithms. The first is a classical Gibbs sampler that generates samples from the posterior distribution. The other two are stochastic optimization algorithms that optimize either the marginal distribution of the sources, or the marginal distribution of the parameters of the sources and mixing filters, conditional upon the observations. Simulations are presented.