Digital Library

Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning

**Author(s):** Fernández Gauna, Borja; Etxeberria Agiriano, Ismael; Graña Romay, Manuel María
Date(s)	11/04/2016 11/04/2016 09/07/2015
Abstract	Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out concurrently by the agents. In this paper we formalize and prove the convergence of a Distributed Round-Robin Q-Learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out round-robin scheduling of action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the globally optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.
Identifier	PLOS ONE 10(7) 2015 : (2015) // Article ID e0127129 1932-6203 http://hdl.handle.net/10810/17878 10.1371/journal.pone.0127129
Language(s)	eng
Publisher	Public Library of Science
Relation	http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0127129#abstract0 info:eu-repo/grantAgreement/EC/FP7/317947
Rights	© 2015 Fernandez-Gauna et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. info:eu-repo/semantics/openAccess
Keywords	#system control #reinforcement #constraints #MDPS
Type	info:eu-repo/semantics/article
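
The abstract describes two mechanisms: round-robin scheduling, where only one agent selects and executes an action per step, and state-action vetoes, where pairs that lead to an undesired termination state are excluded from later selection. The following is a minimal toy sketch of those two ideas only, not the authors' D-RR-QL algorithm. Everything in it is an assumption made for illustration: the one-dimensional two-robot world, the constants `LENGTH`, `HOSE`, `GOAL` and `START`, the names `Agent`, `step`, `run_episode` and `train`, the choice to update each agent between its own consecutive turns, and the shared team reward. The modular state decomposition and the message-passing coordination mentioned in the abstract are not modelled.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)            # step left, stay, step right
LENGTH, HOSE, GOAL = 6, 2, 5    # toy constants, not taken from the paper
START = (1, 0)                  # (lead robot position, follower position)


class Agent:
    """Tabular Q-learner with its own Q-table and its own veto set."""

    def __init__(self, seed, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # (state, action) -> value estimate
        self.vetoed = set()           # (state, action) pairs that caused a failure
        self.alpha, self.gamma = alpha, gamma
        self.rng = random.Random(seed)
        self.pending = None           # [state, action, reward since own last move]

    def allowed(self, state):
        # Never empty in this toy world: "stay" cannot fail, so it is never vetoed.
        return [a for a in ACTIONS if (state, a) not in self.vetoed]

    def act(self, state, epsilon):
        options = self.allowed(state)
        if self.rng.random() < epsilon:
            return self.rng.choice(options)
        best = max(self.q[(state, a)] for a in options)
        return self.rng.choice([a for a in options if self.q[(state, a)] == best])

    def learn(self, next_state, done):
        """Q-learning update for the move made on this agent's previous turn."""
        if self.pending is None:
            return
        state, action, reward = self.pending
        future = 0.0 if done else max(self.q[(next_state, a)] for a in self.allowed(next_state))
        self.q[(state, action)] += self.alpha * (reward + self.gamma * future - self.q[(state, action)])
        self.pending = None


def step(state, mover, action):
    """Move one robot; return (next_state, reward, failed, reached_goal)."""
    pos = list(state)
    pos[mover] += action
    if not 0 <= pos[mover] < LENGTH or abs(pos[0] - pos[1]) > HOSE:
        return state, 0.0, True, False       # left the track or overstretched the hose
    reached = pos[0] == GOAL
    return tuple(pos), (1.0 if reached else 0.0), False, reached


def run_episode(agents, epsilon, max_steps=200):
    """Return 'goal', 'failed' or 'timeout'."""
    state = START
    for agent in agents:
        agent.pending = None
    for t in range(max_steps):
        mover = t % len(agents)              # round-robin: one agent acts per step
        agent = agents[mover]
        agent.learn(state, done=False)       # settle its previous move first
        action = agent.act(state, epsilon)
        next_state, reward, failed, reached = step(state, mover, action)
        if failed:
            agent.vetoed.add((state, action))   # veto the pair instead of updating its value
            return "failed"
        agent.pending = [state, action, 0.0]
        for a in agents:
            if a.pending is not None:
                a.pending[2] += reward       # shared team reward
        state = next_state
        if reached:
            for a in agents:
                a.learn(state, done=True)
            return "goal"
    return "timeout"


def train(episodes=500, epsilon=0.3, seed=0):
    agents = [Agent(seed), Agent(seed + 1)]
    outcomes = [run_episode(agents, epsilon) for _ in range(episodes)]
    return agents, outcomes


if __name__ == "__main__":
    agents, outcomes = train()
    print({k: outcomes.count(k) for k in ("goal", "failed", "timeout")})
    print("greedy rollout:", run_episode(agents, epsilon=0.0))
```

Because a vetoed pair is never selected again, each failed episode in this sketch adds exactly one new entry to one agent's veto set, so the number of failures is bounded by the number of unsafe state-action pairs. That is the intuition behind the abstract's claim that vetoes speed up learning in over-constrained systems, shown here only on a toy problem.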
