Digital Library

**Author(s):** Wang, Y-G.
Date(s)	1997
Abstract	For a wide class of semi-Markov decision processes, the optimal policies can be expressed in terms of Gittins indices, which have proved useful in sequential clinical trials and pharmaceutical research planning. In general, the indices can be approximated by calibration based on finite-horizon dynamic programming. This paper provides some results on the accuracy of such approximations and, in particular, gives error bounds for some well-known processes (Bernoulli reward processes, normal reward processes and exponential target processes).
Identifier	http://eprints.qut.edu.au/90622/
Publisher	Wiley-Blackwell Publishing Asia
Relation	DOI:10.1111/j.1467-842X.1997.tb00538.x Wang, Y-G. (1997) Error bounds for calculation of the Gittins indices. Australian Journal of Statistics, 39(2), pp. 225-233.
Rights	Copyright CSIRO
Source	Science & Engineering Faculty
Keywords	#bandit process #clinical trials #dynamic programming #stopping time #multi-armed bandits #dynamic allocation indexes #delayed-responses #clinical-trials
Type	Journal Article
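
The calibration idea mentioned in the abstract can be illustrated with a minimal sketch for a Bernoulli reward process. This is not taken from the paper: it assumes a Beta(a, b) prior on the success probability, a discount factor `beta`, and a truncation of the stopping problem to at most `horizon` pulls. The function names `continuation_value` and `gittins_index_bernoulli` are hypothetical. The unknown arm is calibrated against a standard arm paying a known amount `p` per pull, and the approximate index is the value of `p` at which pulling the unknown arm at least once is exactly break-even.

```python
def continuation_value(a, b, p, beta, horizon):
    """Best expected discounted sum of (reward - p) when the arm is pulled
    at least once and at most `horizon` times, starting from Beta(a, b)."""
    # Deepest level: k pulls already made, exactly one pull left.
    k = horizon - 1
    g = [(a + s) / (a + b + k) - p for s in range(k + 1)]
    # Backward induction towards the root state (a, b).
    for k in range(horizon - 2, -1, -1):
        nxt = g
        g = []
        for s in range(k + 1):  # s = successes observed so far
            mean = (a + s) / (a + b + k)  # posterior mean of the arm
            # After this pull we may stop (value 0) or continue.
            future = mean * max(0.0, nxt[s + 1]) + (1.0 - mean) * max(0.0, nxt[s])
            g.append(mean - p + beta * future)
    return g[0]


def gittins_index_bernoulli(a, b, beta, horizon, tol=1e-8):
    """Approximate index by bisection on the calibration value p (assumed sketch)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The continuation value is decreasing in p, so bisection applies.
        if continuation_value(a, b, mid, beta, horizon) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, round(gittins_index_bernoulli(1, 1, 0.9, n), 4))
```

Because the truncation only restricts the set of admissible stopping times, the value returned by this sketch cannot decrease as `horizon` grows. With `horizon=1` it reduces to the posterior mean a/(a+b).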
