The actor-critic algorithm as multi-time-scale stochastic approximation


Author(s): Borkar, Vivek S; Konda, Vijaymohan R
Date(s)

01/08/1997

Abstract

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.

Format

application/pdf

Identifier

http://eprints.iisc.ernet.in/38531/1/The_actor-critic_algorithm.pdf

Borkar, Vivek S and Konda, Vijaymohan R (1997) The actor-critic algorithm as multi-time-scale stochastic approximation. In: Sadhana : Academy Proceedings in Engineering Sciences, 22 (part 4). pp. 525-543.

Publisher

Indian Academy of Sciences

Relation

http://www.springerlink.com/content/y7j344885r08515m/

http://eprints.iisc.ernet.in/38531/

Keywords

Computer Science & Automation (Formerly, School of Automation)

Type

Journal Article

PeerReviewed