2 results for Petitpied, Nicolas.

in Cambridge University Engineering Department Publications Database


Abstract:

This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties. As an illustration, the method is applied to learning a human controller.
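To make the term "optimal value function" concrete, the sketch below computes it by plain value iteration on a hypothetical two-state, two-action MDP whose transitions and rewards are fully known. This is not the Bayesian MCMC method of the abstract, which infers the value function from observed state-action pairs; the function name `value_iteration`, the arrays `P` and `R`, and the discount factor are all assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Return the optimal value function and a greedy policy.

    P[a, s, t] is the probability of moving from state s to state t
    under action a; R[s, a] is the immediate reward.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * E[V(next state)]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Recompute Q from the final V to read off the greedy policy.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)

# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
# Reward 1 only for staying in state 1; every other choice earns 0.
R = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
])

V_star, policy = value_iteration(P, R, gamma=0.9)
print(V_star)   # approximately [9, 10]
print(policy)   # [1 0]: switch out of state 0, stay in state 1
```

In this toy problem an observer who sees only the pairs (state 0, switch) and (state 1, stay) would be in the position the abstract describes: the actions are visible, while the value function that explains them has to be inferred.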

Abstract:

Optimization on manifolds is a rapidly developing branch of nonlinear optimization. Its focus is on problems where the smooth geometry of the search space can be leveraged to design efficient numerical algorithms. In particular, optimization on manifolds is well-suited to deal with rank and orthogonality constraints. Such structured constraints appear pervasively in machine learning applications, including low-rank matrix completion, sensor network localization, camera network registration, independent component analysis, metric learning, dimensionality reduction and so on. The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplifying experimentation with state-of-the-art Riemannian optimization algorithms. By dealing internally with most of the differential geometry, the package aims particularly at lowering the entrance barrier. © 2014 Nicolas Boumal.
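As an illustration of the kind of differential geometry such a package handles internally, the sketch below runs Riemannian gradient descent on the unit sphere to maximize the Rayleigh quotient x^T A x, whose maximizer is a dominant eigenvector of A. It uses plain NumPy and does not use or reproduce the Manopt API; the function name `sphere_gradient_descent`, the test matrix, the fixed step size, and the iteration count are assumptions chosen for this example.

```python
import numpy as np

def sphere_gradient_descent(A, x0, step, n_iter=500):
    """Minimize f(x) = -x^T A x over the unit sphere (A symmetric)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_iter):
        egrad = -2.0 * (A @ x)            # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x   # project onto the tangent space at x
        x = x - step * rgrad              # move along the tangent direction
        x /= np.linalg.norm(x)            # retraction: renormalize onto the sphere
    return x

# Hypothetical symmetric test matrix with a well-separated top eigenvalue.
A = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [0.0, 1.0, 2.0],
])

# Conservative fixed step based on the spectral norm of A (an assumption,
# not a tuned or general-purpose choice).
step = 1.0 / (4.0 * np.linalg.norm(A, 2))

x_star = sphere_gradient_descent(A, np.array([1.0, 0.0, 0.0]), step)
rayleigh = x_star @ A @ x_star

print(rayleigh)                   # close to the largest eigenvalue of A
print(np.linalg.eigvalsh(A)[-1])  # reference value from a dense eigensolver
```

The two manifold-specific ingredients are the tangent-space projection and the retraction. Everything else is ordinary gradient descent, which is why a toolbox can hide the geometry behind a manifold description and leave the user to supply only the cost and its gradient.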