A unified framework for linear function approximation of value functions in stochastic control


Author(s): Sánchez Fernández, Matilde; Valcarcel Macua, Sergio; Zazo Bello, Santiago
Date(s)

2013

Abstract

This paper contributes a unified formulation that merges previous analyses of the prediction of the performance (value function) of a certain sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
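
As a rough illustration of the setting the abstract describes (a fixed policy, states represented by features, and a linearly approximated value function), the sketch below evaluates a policy on a tiny Markov chain. It is not the algorithm proposed in the paper. The chain `P`, rewards `r`, discount `gamma`, feature matrix `Phi`, and uniform state weighting `d` are all hypothetical values chosen for the example, and the temporal-difference fixed point used here is only one standard criterion; the abstract does not name the two cost functions it relates.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def true_values(P, r, gamma):
    """Exact value function of the policy: solve (I - gamma P) v = r."""
    n = len(r)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, r)


def td_fixed_point(Phi, P, r, d, gamma):
    """Weights theta with Phi^T D (Phi - gamma P Phi) theta = Phi^T D r."""
    n, k = len(Phi), len(Phi[0])
    # Expected next-state features for every state: rows of P Phi.
    PPhi = [[sum(P[s][t] * Phi[t][j] for t in range(n)) for j in range(k)]
            for s in range(n)]
    A = [[sum(d[s] * Phi[s][i] * (Phi[s][j] - gamma * PPhi[s][j])
              for s in range(n)) for j in range(k)] for i in range(k)]
    b = [sum(d[s] * Phi[s][i] * r[s] for s in range(n)) for i in range(k)]
    return solve(A, b)


# Hypothetical 3-state chain induced by a fixed policy (assumed values).
# P is doubly stochastic, so the uniform weighting d is its stationary law.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
r = [1.0, 0.0, 2.0]          # expected reward in each state (assumed)
gamma = 0.9                  # discount factor (assumed)
d = [1.0 / 3.0] * 3          # state weighting (assumed)
Phi = [[1.0, 0.0],           # two features per state (assumed)
       [1.0, 1.0],
       [1.0, 2.0]]

v = true_values(P, r, gamma)
theta = td_fixed_point(Phi, P, r, d, gamma)
v_hat = [sum(Phi[s][j] * theta[j] for j in range(len(theta)))
         for s in range(len(Phi))]

for s in range(len(v)):
    print(f"state {s}: exact {v[s]:.4f}  linear approximation {v_hat[s]:.4f}")
```

With fewer features than states, the linear estimate generally cannot match the exact values in every state; which weights count as "optimal" then depends on the cost function being minimized, which is the question the abstract addresses.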

Format

application/pdf

Identifier

http://oa.upm.es/28942/

Language(s)

eng

Publisher

E.T.S.I. Telecomunicación (UPM)

Relation

http://oa.upm.es/28942/1/INVE_MEM_2013_166531.pdf

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6811729&tag=1

Rights

http://creativecommons.org/licenses/by-nc-nd/3.0/es/

info:eu-repo/semantics/openAccess

Source

21st European Signal Processing Conference (EUSIPCO) | 09/09/2013 - 13/09/2013 | Marrakech, Morocco

Keywords

#Telecommunications

Type

info:eu-repo/semantics/conferenceObject

Conference or Workshop Paper

PeerReviewed