1 result for stochastic volatility diffusions
in Massachusetts Institute of Technology
Abstract:
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
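
To make the link between Q-learning and dynamic programming concrete, the following is a minimal sketch and not the paper's construction. It assumes a small hypothetical two-state, two-action MDP (the table `P`, the discount `GAMMA`, and the step-size exponent 0.7 are all illustrative choices, not taken from the paper). It computes the optimal action values once by DP (value iteration with the known model) and once by tabular Q-learning, which sees only sampled transitions and uses decreasing step sizes of the kind stochastic approximation theory typically requires (their sum diverges, the sum of their squares is finite).

```python
import random

# Hypothetical 2-state, 2-action MDP, for illustration only.
# P[s][a] is a list of (probability, next_state, reward) outcomes.
P = {
    0: {0: [(0.8, 0, 1.0), (0.2, 1, 0.0)],
        1: [(0.3, 0, 0.0), (0.7, 1, 2.0)]},
    1: {0: [(0.5, 0, 0.5), (0.5, 1, 0.5)],
        1: [(0.9, 1, 1.0), (0.1, 0, 0.0)]},
}
GAMMA = 0.5  # discount factor (assumed)


def greedy_value(Q, s):
    """Value of the best action in state s under the table Q."""
    return max(Q[(s, b)] for b in P[s])


def value_iteration(tol=1e-10):
    """DP baseline: iterate the Bellman optimality operator using the known model."""
    Q = {(s, a): 0.0 for s in P for a in P[s]}
    while True:
        new_Q = {
            (s, a): sum(p * (r + GAMMA * greedy_value(Q, s2))
                        for p, s2, r in P[s][a])
            for s in P for a in P[s]
        }
        diff = max(abs(new_Q[k] - Q[k]) for k in Q)
        Q = new_Q
        if diff < tol:
            return Q


def q_learning(steps=200_000, seed=0):
    """Tabular Q-learning driven by sampled transitions only."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in P for a in P[s]}
    visits = {k: 0 for k in Q}
    pairs = list(Q)
    for _ in range(steps):
        s, a = rng.choice(pairs)  # every pair keeps being visited
        outcomes = P[s][a]
        _, s2, r = rng.choices(outcomes, weights=[o[0] for o in outcomes])[0]
        visits[(s, a)] += 1
        # Step size n^(-0.7): sum diverges, sum of squares converges.
        alpha = visits[(s, a)] ** -0.7
        # Noisy, one-sample estimate of the Bellman backup used by DP.
        target = r + GAMMA * greedy_value(Q, s2)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q


if __name__ == "__main__":
    q_dp = value_iteration()
    q_sa = q_learning()
    gap = max(abs(q_dp[k] - q_sa[k]) for k in q_dp)
    print(f"max |Q_dp - Q_learned| = {gap:.4f}")
```

The only difference between the two routines is that `value_iteration` averages over the known transition probabilities, while `q_learning` replaces that expectation with a single sampled outcome and damps the resulting noise with a shrinking step size. That substitution is what lets the algorithm be read as a stochastic approximation to DP.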