Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation


Author(s): Valcarcel Macua, Sergio; Belanovic, Pavle; Zazo Bello, Santiago
Date(s)

01/05/2012

Abstract

We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information among neighbors. Our algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, making it applicable to multi-agent settings. Simulations illustrate the benefit of cooperation in learning, as made possible by the proposed algorithm.
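As a rough illustration of the idea described in the abstract, the sketch below combines a GTD2-style local update (linear features, auxiliary weight vector) with a diffusion "combine" step that averages intermediate estimates over neighbors. All names, constants, the ring topology, and the synthetic data are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_agents, d = 4, 3          # number of agents, feature dimension (assumed)
gamma = 0.9                 # discount factor
alpha, beta = 0.05, 0.05    # step sizes for theta and the auxiliary weights

# Doubly stochastic combination matrix over an assumed ring of neighbors
C = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    for j in (i - 1, i, i + 1):
        C[i, j % n_agents] = 1.0 / 3.0

theta = np.zeros((n_agents, d))   # value-function parameters, one row per agent
w = np.zeros((n_agents, d))       # auxiliary GTD weights, one row per agent

for step in range(2000):
    # Each agent observes its own transition (phi, phi_next, reward);
    # here the transitions are synthetic random data.
    phi = rng.standard_normal((n_agents, d))
    phi_next = rng.standard_normal((n_agents, d))
    reward = phi @ np.ones(d)

    psi = np.zeros_like(theta)    # intermediate (pre-combination) estimates
    for i in range(n_agents):
        # TD error for agent i's local sample
        delta = reward[i] + gamma * phi_next[i] @ theta[i] - phi[i] @ theta[i]
        # GTD2-style local gradient steps
        psi[i] = theta[i] + alpha * (phi[i] - gamma * phi_next[i]) * (phi[i] @ w[i])
        w[i] += beta * (delta - phi[i] @ w[i]) * phi[i]

    # Diffusion step: each agent combines its neighbors' intermediate estimates
    theta = C @ psi
```

The combine step is what distinguishes the diffusion strategy from running independent learners: neighboring estimates are mixed at every iteration, so local information propagates through the network.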

Format

application/pdf

Identifier

http://oa.upm.es/20234/

Language(s)

eng

Publisher

E.T.S.I. Telecomunicación (UPM)

Relation

http://oa.upm.es/20234/1/INVE_MEM_2012_137146.pdf

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6232901


Rights

http://creativecommons.org/licenses/by-nc-nd/3.0/es/

info:eu-repo/semantics/openAccess

Source

3rd International Workshop on Cognitive Information Processing (CIP) | 28/05/2012 - 30/05/2012 | Baiona

Keywords #Telecommunications #Robotics and Industrial Informatics
Type

info:eu-repo/semantics/conferenceObject

Conference or Workshop Paper

PeerReviewed