Biblioteca Digital

444 resultados para multi-armed bandit

em Queensland University of Technology - ePrints Archive

Error bounds for calculation of the Gittins indices

Relevância:

90.00% 90.00%

Publicador:

Resumo:

For a wide class of semi-Markov decision processes the optimal policies are expressible in terms of the Gittins indices, which have been found useful in sequential clinical trials and pharmaceutical research planning. In general, the indices can be approximated via calibration based on dynamic programming of finite horizon. This paper provides some results on the accuracy of such approximations, and, in particular, gives the error bounds for some well known processes (Bernoulli reward processes, normal reward processes and exponential target processes).

Sequential allocation in clinical trials

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Suppose two treatments with binary responses are available for patients with some disease and that each patient will receive one of the two treatments. In this paper we consider the interests of patients both within and outside a trial using a Bayesian bandit approach and conclude that equal allocation is not appropriate for either group of patients. It is suggested that Gittins indices should be used (using an approach called dynamic discounting by choosing the discount rate based on the number of future patients in the trial) if the disease is rare, and the least failures rule if the disease is common. Some analytical and simulation results are provided.

Prediction with limited advice and multiarmed bandits with paid observations

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We study two problems of online learning under restricted information access. In the first problem, prediction with limited advice, we consider a game of prediction with expert advice, where on each round of the game we query the advice of a subset of M out of N experts. We present an algorithm that achieves O(√(N/M)TlnN ) regret on T rounds of this game. The second problem, the multiarmed bandit with paid observations, is a variant of the adversarial N-armed bandit game, where on round t of the game we can observe the reward of any number of arms, but each observation has a cost c. We present an algorithm that achieves O((cNlnN) 1/3 T2/3+√TlnN ) regret on T rounds of this game in the worst case. Furthermore, we present a number of refinements that treat arm- and time-dependent observation costs and achieve lower regret under benign conditions. We present lower bounds that show that, apart from the logarithmic factors, the worst-case regret bounds cannot be improved.

Gittins indices and constrained allocation in clinical trials

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We explore the use of Gittins indices to search for near optimality in sequential clinical trials. Some adaptive allocation rules are proposed to achieve the following two objectives as far as possible: (i) to reduce the expected successes lost, (ii) to minimize the error probability at the end. Simulation results indicate the merits of the rules based on Gittins indices for small trial sizes. The rules are generalized to the case when neither of the response densities is known. Asymptotic optimality is derived for the constrained rules. A simple allocation rule is recommended for one-stage models. The simulation results indicate that it works better than both equal allocation and Bather's randomized allocation. We conclude with a discussion of possible further developments.

Experiences with an Object-Oriented, Multi-Stage Language

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Metaphor is a multi-stage programming language extension to an imperative, object-oriented language in the style of C# or Java. This paper discusses some issues we faced when applying multi-stage language design concepts to an imperative base language and run-time environment. The issues range from dealing with pervasive references and open code to garbage collection and implementing cross-stage persistence.

A Multi-modal Biometric System for User Authentication

Relevância:

20.00% 20.00%

Publicador:

Multi-Modality in a New Key: The Significance of the Arts in Research & Education

Relevância:

20.00% 20.00%

Publicador:

A New Receiver Autonomous Integrity Monitoring Strategy for Multi-Fault Exclusion with GPS and Virtual Galileo Measurements

Relevância:

20.00% 20.00%

Publicador:

A Simulation Model for Mixed-Model Multi-Stage Assembly Lines

Relevância:

20.00% 20.00%

Publicador:

Prediction of Bead Width in the Arc Multi-pass Welding using BP Neural Network Model

Relevância:

20.00% 20.00%

Publicador:

Multi Criteria Maintenance Performance Measurement: A Conceptual Model

Relevância:

20.00% 20.00%

Publicador:

The Death of the Overreaction Anomaly: Multi-factor Explanations for Contrarian Returns

Relevância:

20.00% 20.00%

Publicador:

Factor Structure and Psychometric Properties of Frost Multi Dimensional Perfectionism Scale: An Investigation on the Basis of Australian Population

Relevância:

20.00% 20.00%

Publicador:

The NetWorkPlaceTM? Phenomenon: Applying a Multi-Disciplinary Approach to Research and Practice

Relevância:

20.00% 20.00%

Publicador:

Rough Set Model for Constraint-based Multi-dimensional Association Rule Mining

Relevância:

20.00% 20.00%

Publicador:

«
1
2
3
4
5
6
7
8
...
29
30
»