Digital Library

**Author(s):** Bhatnagar, Shalabh; Borkar, Vivek S; Prabuchandran, KJ
Date(s)	2013
Abstract	We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassmann manifold of features. We present a proof of convergence of our algorithm. We report empirical results for our algorithm and for a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.
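The two-timescale scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy deterministic cycle chain, the step sizes `a` and `b`, the function names, and the use of a QR factorization to return the feature matrix to orthonormal columns are all assumptions made for illustration. The chain is deterministic so that the sampled residual gradient is unbiased; stochastic transitions would need extra care.

```python
import numpy as np

def retract(Phi, theta):
    """Re-orthonormalize Phi by QR and re-express theta so Phi @ theta is unchanged.
    (Assumed retraction; the paper may use a different manifold update.)"""
    Q, R = np.linalg.qr(Phi)
    return Q, R @ theta

def msbe(Phi, theta, P, r, gamma):
    """Mean square Bellman error of V = Phi @ theta, averaged uniformly over states."""
    V = Phi @ theta
    return float(np.mean((r + gamma * P @ V - V) ** 2))

def run(n_states=5, k=2, gamma=0.9, steps=2000, a=0.05, b=0.005, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical toy problem: deterministic cycle s -> s+1 (mod n),
    # with reward 1 when leaving state 0 and 0 elsewhere.
    nxt = (np.arange(n_states) + 1) % n_states
    r = np.zeros(n_states)
    r[0] = 1.0
    P = np.eye(n_states)[nxt]  # row s is the indicator of the successor state
    # Start from a random feature matrix with orthonormal columns.
    Phi, _ = np.linalg.qr(rng.standard_normal((n_states, k)))
    theta = np.zeros(k)
    s = 0
    for _ in range(steps):
        s2 = nxt[s]
        # Sampled Bellman residual for the transition (s, r[s], s2).
        delta = r[s] + gamma * Phi[s2] @ theta - Phi[s] @ theta
        # Faster timescale: residual gradient of 0.5 * delta**2 w.r.t. theta.
        grad_theta = delta * (gamma * Phi[s2] - Phi[s])
        # Slower timescale: gradient of 0.5 * delta**2 w.r.t. the rows of Phi.
        G = np.zeros_like(Phi)
        G[s] -= delta * theta
        G[s2] += gamma * delta * theta
        theta = theta - a * grad_theta
        # Smaller step b for the features, then return to orthonormal columns.
        Phi, theta = retract(Phi - b * G, theta)
        s = s2
    return Phi, theta, msbe(Phi, theta, P, r, gamma)

if __name__ == "__main__":
    Phi, theta, err = run()
    print("MSBE after training:", err)
```

Choosing `b` much smaller than `a` is what makes the feature search the slower of the two updates; replacing `grad_theta` with a TD(0) step, `-delta * Phi[s]`, would give a rough analogue of the temporal difference variant the abstract mentions.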
Format	application/pdf
Identifier	http://eprints.iisc.ernet.in/47567/1/Ieee_Jou_Sel_Top_Sig_Pro_7-5_746_2013.pdf Bhatnagar, Shalabh and Borkar, Vivek S and Prabuchandran, KJ (2013) Feature Search in the Grassmanian in Online Reinforcement Learning. In: IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 7 (5). pp. 746-758.
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Relation	http://dx.doi.org/10.1109/JSTSP.2013.2255022 http://eprints.iisc.ernet.in/47567/
Keywords	#Computer Science & Automation (Formerly, School of Automation)
Type	Journal Article PeerReviewed
