Digital Library

**Author(s):** Prashanth, LA; Bhatnagar, Shalabh
Date(s)	18/11/2011
Abstract	We propose for the first time two reinforcement learning algorithms with function approximation for average cost adaptive control of traffic lights. One of these algorithms is a version of Q-learning with function approximation while the other is a policy gradient actor-critic algorithm that incorporates multi-timescale stochastic approximation. We show performance comparisons on various network settings of these algorithms with a range of fixed timing algorithms, as well as a Q-learning algorithm with full state representation that we also implement. We observe that whereas (as expected) on a two-junction corridor, the full state representation algorithm shows the best results, this algorithm is not implementable on larger road networks. The algorithm PG-AC-TLC that we propose is seen to show the best overall performance.
Format	application/pdf
Identifier	http://eprints.iisc.ernet.in/42822/1/Reinforcement_Learning.pdf Prashanth, LA and Bhatnagar, Shalabh (2011) Reinforcement learning with average cost for adaptive control of traffic lights at intersections. In: 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), 5-7 Oct. 2011, Washington, DC, USA.
Publisher	IEEE
Relation	http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6082823 http://eprints.iisc.ernet.in/42822/
Keywords	#Computer Science & Automation (Formerly, School of Automation)
Type	Conference Paper PeerReviewed
