1 result for ARBITRARY MAGNITUDE
at Massachusetts Institute of Technology
Abstract:
We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP and allows the experience to be gathered with an arbitrary set of policies. The return can then be estimated for any new policy of the POMDP. We motivate the estimator from both function-approximation and importance-sampling points of view and derive its theoretical properties. Although the estimator is biased, it has low variance, and the bias is often irrelevant when the estimator is used for pairwise comparisons. We conclude by extending the estimator to policies with memory and comparing its performance in a greedy search algorithm to that of the REINFORCE algorithm, showing an order-of-magnitude reduction in the number of trials required.
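To make the importance-sampling view concrete, here is a minimal sketch of a self-normalized (weighted) importance-sampling return estimate. It is an illustration under stated assumptions, not necessarily the estimator defined in the paper: policies are assumed to be memoryless and stored as nested dictionaries mapping observation to action to probability, each episode is assumed to record the behavior policy that generated it, and the names `normalized_is_return`, `episodes`, and `new_policy` are hypothetical. Because the unknown transition and observation probabilities appear in both the numerator and the denominator of each trajectory's likelihood ratio, they cancel, so only the policies' action probabilities are needed. Dividing by the sum of the weights, rather than by the number of episodes, is what introduces bias while typically reducing variance.

```python
def normalized_is_return(episodes, new_policy):
    """Estimate the expected return of `new_policy` from logged episodes.

    episodes: list of (steps, total_return, behavior_policy) tuples, where
        steps is a list of (observation, action) pairs and behavior_policy
        is the policy that generated the episode (assumed data layout).
    new_policy / behavior_policy: dict observation -> dict action -> prob.
    """
    weighted_return_sum = 0.0
    weight_sum = 0.0
    for steps, total_return, behavior_policy in episodes:
        # Likelihood ratio of the episode's action choices. The unknown
        # POMDP dynamics cancel, leaving only policy probabilities.
        weight = 1.0
        for obs, action in steps:
            weight *= new_policy[obs][action] / behavior_policy[obs][action]
        weighted_return_sum += weight * total_return
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("new policy assigns zero probability to all logged episodes")
    # Normalizing by the total weight (not the episode count) makes the
    # estimate biased but usually lowers its variance.
    return weighted_return_sum / weight_sum


if __name__ == "__main__":
    uniform = {"o": {"a": 0.5, "b": 0.5}}    # behavior policy (illustrative)
    candidate = {"o": {"a": 0.9, "b": 0.1}}  # policy to evaluate (illustrative)
    logged = [
        ([("o", "a")], 1.0, uniform),
        ([("o", "b")], 0.0, uniform),
    ]
    print(normalized_is_return(logged, candidate))
```

For pairwise comparison of two candidate policies, the same logged episodes can be passed to the function once per candidate and the two estimates compared directly, with no new trials required.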