Error bounds for calculation of the Gittins indices


Author(s): Wang, Y-G.
Date(s)

1997

Abstract

For a wide class of semi-Markov decision processes, the optimal policies are expressible in terms of the Gittins indices, which have been found useful in sequential clinical trials and pharmaceutical research planning. In general, the indices can be approximated via calibration based on finite-horizon dynamic programming. This paper provides some results on the accuracy of such approximations and, in particular, gives error bounds for some well-known processes (Bernoulli reward processes, normal reward processes and exponential target processes).
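
As an illustration of the calibration idea mentioned in the abstract, the following is a minimal Python sketch for a Bernoulli reward process. It is not taken from the paper: the Beta(a0, b0) prior, the discount factor `beta`, the truncation rule at the horizon (commit to the better of the two options forever), and the function names `continuation_advantage` and `gittins_bernoulli` are all assumptions made for this example. The index is located by bisection on the success rate `lam` of a known standard arm.

```python
def continuation_advantage(a0, b0, lam, beta, horizon):
    """Value of pulling the unknown arm once more minus the value of retiring
    to a standard arm with known success rate lam (assumed setup)."""
    retire = lam / (1.0 - beta)
    # Truncation at the horizon (an assumption of this sketch): commit forever
    # to whichever option has the larger expected reward per pull.
    n = a0 + b0 + horizon
    values = [max(lam, (a0 + s) / n) / (1.0 - beta) for s in range(horizon + 1)]
    # Backward induction over depth d; s counts successes observed so far.
    for d in range(horizon - 1, 0, -1):
        n = a0 + b0 + d
        values = [
            max(
                retire,
                (a0 + s) / n
                + beta * ((a0 + s) / n * values[s + 1]
                          + (1.0 - (a0 + s) / n) * values[s]),
            )
            for s in range(d + 1)
        ]
    p = a0 / (a0 + b0)
    cont = p + beta * (p * values[1] + (1.0 - p) * values[0])
    return cont - retire


def gittins_bernoulli(a0, b0, beta, horizon, iterations=50):
    """Approximate index: the lam at which continuing and retiring break even."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if continuation_advantage(a0, b0, mid, beta, horizon) > 0.0:
            lo = mid  # the unknown arm is still preferred, so the index is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for horizon in (1, 10, 100):
        print(horizon, gittins_bernoulli(1, 1, 0.9, horizon))
```

A longer horizon gives a closer approximation at a higher computational cost, since the number of states grows quadratically with the horizon; how large the remaining error can be is the question the paper addresses.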

Identifier

http://eprints.qut.edu.au/90622/

Publisher

Wiley-Blackwell Publishing Asia

Relation

DOI:10.1111/j.1467-842X.1997.tb00538.x

Wang, Y-G. (1997) Error bounds for calculation of the Gittins indices. Australian Journal of Statistics, 39(2), pp. 225-233.

Rights

Copyright CSIRO

Source

Science & Engineering Faculty

Keywords

#bandit process #clinical trials #dynamic programming #stopping time #multi-armed bandits #dynamic allocation indexes #delayed-responses #clinical-trials

Type

Journal Article