1 result for Stochastic exponential stabilities
in Massachusetts Institute of Technology
Abstract:
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
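
As a rough illustration of the link between Q-learning and dynamic programming described above, the sketch below runs tabular Q-learning on a tiny Markov decision process and compares the result with the fixed point obtained by DP value iteration. This is not the construction used in the paper. The two-state MDP, its transition probabilities and rewards, the discount factor, the uniform exploration policy, and the step-size schedule `n ** -0.7` are all assumptions chosen for illustration; the schedule is picked only because its sum diverges while the sum of its squares converges, the usual step-size conditions in stochastic approximation.

```python
import random

GAMMA = 0.5  # discount factor (assumed for this sketch)

# Hypothetical 2-state, 2-action MDP; all numbers are illustrative only.
# P1[s][a] = probability of moving to state 1 after action a in state s.
# R[s][a]  = deterministic reward for taking action a in state s.
P1 = [[0.2, 0.9], [0.5, 0.1]]
R = [[0.0, 1.0], [0.5, 0.2]]


def value_iteration(sweeps=200):
    """Dynamic programming: iterate the Bellman optimality operator on Q."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        v = [max(row) for row in q]  # greedy state values
        q = [[R[s][a] + GAMMA * ((1 - P1[s][a]) * v[0] + P1[s][a] * v[1])
              for a in range(2)] for s in range(2)]
    return q


def q_learning(steps=50000, seed=0):
    """Tabular Q-learning: a sampled, incremental version of the same update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    visits = [[0, 0], [0, 0]]
    s = 0
    for _ in range(steps):
        a = rng.randrange(2)  # uniform random exploration
        s_next = 1 if rng.random() < P1[s][a] else 0
        visits[s][a] += 1
        # Decaying step size: sum diverges, sum of squares converges.
        alpha = visits[s][a] ** -0.7
        target = R[s][a] + GAMMA * max(q[s_next])  # one-sample Bellman backup
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


q_star = value_iteration()
q_hat = q_learning()
max_err = max(abs(q_hat[s][a] - q_star[s][a])
              for s in range(2) for a in range(2))
print("largest gap between Q-learning and DP:", max_err)
```

The only difference between the two routines is that value iteration uses the full expectation over next states, while Q-learning replaces it with a single sampled transition and averages out the noise through the decaying step size.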