Digital Library

3 results for Stochastic simulation algorithm

at Massachusetts Institute of Technology

On the Convergence of Stochastic Iterative Dynamic Programming Algorithms

Relevance: 30.00%

Abstract:

Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
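As a rough illustration of the kind of algorithm the abstract refers to, the sketch below runs tabular Q-learning with step sizes 1/n, which satisfy the usual stochastic approximation conditions (the step sizes sum to infinity while their squares have a finite sum). The two-state MDP, its rewards, the noise level, and the function name `q_learning` are all hypothetical choices made for this example; none of them come from the paper.

```python
import random


def q_learning(n_steps=40000, gamma=0.5, seed=0):
    """Tabular Q-learning on a hypothetical 2-state, 2-action MDP."""
    rng = random.Random(seed)
    # transitions[s][a] = (next_state, mean_reward); toy dynamics (assumption)
    transitions = {
        0: {0: (0, 0.0), 1: (1, 1.0)},
        1: {0: (0, 0.0), 1: (1, 2.0)},
    }
    q = {s: {a: 0.0 for a in (0, 1)} for s in (0, 1)}
    visits = {s: {a: 0 for a in (0, 1)} for s in (0, 1)}
    s = 0
    for _ in range(n_steps):
        a = rng.choice((0, 1))  # uniform exploration over both actions
        s_next, mean_r = transitions[s][a]
        r = mean_r + rng.gauss(0.0, 0.1)  # noisy reward sample
        visits[s][a] += 1
        alpha = 1.0 / visits[s][a]  # per-pair step size 1/n
        # Sampled one-step DP backup used as the update target
        target = r + gamma * max(q[s_next].values())
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


if __name__ == "__main__":
    print(q_learning())
```

For this toy problem with gamma = 0.5 the optimal values can be worked out by hand: Q*(1,1) = 2 / (1 - 0.5) = 4, Q*(0,1) = 1 + 0.5 * 4 = 3, and Q*(0,0) = Q*(1,0) = 0.5 * 3 = 1.5, so the learned table can be compared against them.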


Hierarchical Mixtures of Experts and the EM Algorithm

Relevance: 30.00%

Abstract:

We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIMs). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
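The E-step and M-step alternation mentioned in the abstract can be sketched on a much simpler model. The example below is not a hierarchical mixture of experts: it fits a flat two-component 1D Gaussian mixture, chosen here only because it keeps the EM structure visible in a few lines. The function name `em_gmm_1d` and the initialization are assumptions made for illustration.

```python
import math
import random


def em_gmm_1d(data, n_iter=50):
    """EM for a two-component 1D Gaussian mixture (a stand-in, not an HME)."""
    mu = [min(data), max(data)]  # crude initialization (assumption)
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    log_liks = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        resp = []
        ll = 0.0
        for x in data:
            p = [
                pi[k] * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in (0, 1)
            ]
            total = p[0] + p[1]
            ll += math.log(total)
            resp.append((p[0] / total, p[1] / total))
        log_liks.append(ll)  # log-likelihood at the current parameters
        # M-step: responsibility-weighted maximum likelihood updates
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk,
                1e-6,  # floor to avoid a degenerate component
            )
            pi[k] = nk / len(data)
    return mu, var, pi, log_liks


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(-2, 1) for _ in range(200)]
    sample += [rng.gauss(3, 1) for _ in range(200)]
    print(em_gmm_1d(sample)[0])
```

A useful sanity check for any EM implementation is that the recorded log-likelihood never decreases from one iteration to the next.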


A Conservative Front Tracking Algorithm

Relevance: 30.00%

Abstract:

Discontinuities in the solutions of systems of conservation laws are widely regarded as one of the main difficulties in numerical simulation. A numerical method is proposed for solving these partial differential equations when the solution contains discontinuities. The method tracks these sharp discontinuities, or interfaces, while still fully maintaining the conservation property. The motion of the front is obtained by solving a Riemann problem based on the state values on both sides of the front, which are reconstructed using a weighted essentially non-oscillatory (WENO) scheme. The propagation of the front is coupled with the evaluation of "dynamic" numerical fluxes. Some numerical tests in 1D and preliminary results in 2D are presented.
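To make the idea of moving a front by solving a Riemann problem concrete, here is a minimal sketch for the simplest case: a single shock in the scalar Burgers equation with constant states on either side. The choice of Burgers' equation, the function names, and the piecewise-constant states are assumptions for illustration; the paper's method handles systems and uses WENO reconstruction, neither of which is attempted here.

```python
def burgers_flux(u):
    """Flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * u * u


def shock_speed(u_left, u_right):
    """Rankine-Hugoniot speed of the discontinuity between two states.

    For Burgers' equation an admissible shock needs u_left > u_right.
    """
    if u_left == u_right:
        return u_left  # no jump: fall back to the characteristic speed f'(u) = u
    return (burgers_flux(u_right) - burgers_flux(u_left)) / (u_right - u_left)


def track_front(x0, u_left, u_right, dt, n_steps):
    """Advance the front position with the Riemann-problem shock speed."""
    s = shock_speed(u_left, u_right)
    x = x0
    path = [x]
    for _ in range(n_steps):
        x += s * dt
        path.append(x)
    return path


def mass(x_front, u_left, u_right, a, b):
    """Integral of the piecewise-constant solution over [a, b]."""
    return u_left * (x_front - a) + u_right * (b - x_front)


if __name__ == "__main__":
    path = track_front(0.0, 2.0, 0.0, dt=0.1, n_steps=10)
    print(path[-1], mass(path[-1], 2.0, 0.0, -1.0, 3.0))
```

Because the front moves at the Rankine-Hugoniot speed, the change in `mass` over a time interval equals the net flux through the domain boundaries, f(u_left) - f(u_right), multiplied by the elapsed time. This is the conservation property in its simplest form.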

