3 results for MARKOV DECISION-PROCESSES
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert, estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IP) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup over traditional "flat" dynamic programming approaches, and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs, while producing the lowest error of any approximation algorithm evaluated. (C) 2011 Elsevier B.V. All rights reserved.
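To illustrate the robust objective this abstract describes (not the paper's factored algorithm, which operates on factored representations with a nonlinear solver), the following is a minimal sketch of "flat" robust value iteration for an MDP-IP, assuming interval-bounded transition probabilities so that the inner worst-case minimization admits a greedy solution. All function names and array shapes are hypothetical.

```python
import numpy as np

def worst_case_expectation(lower, upper, values):
    """Minimize sum_i p_i * values[i] subject to
    lower <= p <= upper and sum(p) == 1.
    Greedy: start every p_i at its lower bound, then push the
    remaining mass onto the lowest-valued successors first."""
    p = lower.copy()
    remaining = 1.0 - p.sum()
    for i in np.argsort(values):          # cheapest successor states first
        add = min(upper[i] - p[i], remaining)
        p[i] += add
        remaining -= add
        if remaining <= 1e-12:
            break
    return float(p @ values)

def robust_value_iteration(R, P_lo, P_hi, gamma=0.95, tol=1e-6):
    """Robust (pessimistic) value iteration for a flat interval MDP-IP.
    R: (S, A) rewards; P_lo, P_hi: (S, A, S) interval bounds on the
    transition probabilities.  Returns the robust value function."""
    S, A = R.shape
    V = np.zeros(S)
    while True:
        Q = np.array([[R[s, a] + gamma * worst_case_expectation(
                          P_lo[s, a], P_hi[s, a], V)
                       for a in range(A)] for s in range(S)])
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```

The greedy inner step is what makes interval credal sets cheap; more general credal sets require exactly the kind of nonlinear constrained solver the abstract identifies as the computational bottleneck.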
Abstract:
We discuss the estimation of the expected value of quality-adjusted survival based on multistate models. We generalize earlier work by allowing the sojourn times in health states to be non-identically distributed for a given vector of covariates. Approaches based on semiparametric and parametric (exponential and Weibull distributions) methodologies are considered. A simulation study is conducted to evaluate the performance of the proposed estimator, and the jackknife resampling method is used to estimate its variance. An application to a real data set is also included.
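To make the estimator concrete, here is a minimal sketch of a plug-in quality-adjusted survival estimate with a jackknife variance, assuming fully observed (uncensored) sojourn times and fixed quality weights; it omits the covariates and the semiparametric/parametric modeling the paper actually uses, and all names are illustrative.

```python
import numpy as np

def qal_estimate(sojourn_times, quality_weights):
    """Plug-in estimate of expected quality-adjusted survival:
    sum over health states of (quality weight) x (mean sojourn time).
    sojourn_times: (n_subjects, n_states) observed sojourn times."""
    return float(np.mean(sojourn_times, axis=0) @ quality_weights)

def jackknife_variance(sojourn_times, quality_weights):
    """Jackknife variance: recompute the estimate leaving each
    subject out, then combine the leave-one-out replicates."""
    n = sojourn_times.shape[0]
    loo = np.array([qal_estimate(np.delete(sojourn_times, i, axis=0),
                                 quality_weights) for i in range(n)])
    return float((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
```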
Abstract:
Consider a continuous-time Markov process with transition rate matrix Q on the state space Λ ∪ {0}. In the associated Fleming-Viot process, N particles evolve independently in Λ with transition rate matrix Q until one of them attempts to jump to state 0. At this moment the particle jumps to the position of one of the other particles, chosen uniformly at random. When Λ is finite, we show that the empirical distribution of the particles at a fixed time converges as N → ∞ to the distribution of a single particle at the same time conditioned on not having touched {0}. Furthermore, the empirical profile of the unique invariant measure of the Fleming-Viot process with N particles converges as N → ∞ to the unique quasi-stationary distribution of the one-particle motion. A key element of the approach is to show that the two-particle correlations are of order 1/N.
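A minimal Gillespie-style simulation sketch of the Fleming-Viot dynamics described above, assuming a finite Λ = {1, ..., K} irreducible under Q and encoded as rows 1..K of the rate matrix, with row 0 the absorbing state; this is an illustration of the particle system, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def fleming_viot(Q, N, T):
    """Simulate N Fleming-Viot particles up to time T.
    Q: (K+1, K+1) transition-rate matrix on {0, 1, ..., K}, with
    state 0 absorbing for the one-particle motion.  Returns the
    empirical distribution of the particles over {1, ..., K}."""
    K = Q.shape[0] - 1
    pos = rng.integers(1, K + 1, size=N)     # start uniformly in Lambda
    t = 0.0
    while True:
        rates = -Q[pos, pos]                 # each particle's total jump rate
        total = rates.sum()
        t += rng.exponential(1.0 / total)    # next event among competing clocks
        if t > T:
            break
        i = rng.choice(N, p=rates / total)   # which particle jumps
        probs = Q[pos[i]].copy()             # off-diagonal rates of its row
        probs[pos[i]] = 0.0
        probs /= probs.sum()
        target = rng.choice(K + 1, p=probs)
        if target == 0:
            # attempted jump to 0: copy the position of another
            # particle chosen uniformly at random instead
            j = rng.choice([k for k in range(N) if k != i])
            pos[i] = pos[j]
        else:
            pos[i] = target
    return np.bincount(pos, minlength=K + 1)[1:] / N
```

Per the abstract's result, for large T the returned empirical profile should approach the quasi-stationary distribution of the one-particle motion as N grows.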