116 results for markov processes
Abstract:
This paper studies: (i) the long-time behaviour of the empirical distribution of age and normalized position of an age-dependent critical branching Markov process conditioned on non-extinction; and (ii) the super-process limit of a sequence of age-dependent critical branching Brownian motions.
Abstract:
This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation.
Abstract:
This paper studies the long-time behavior of the empirical distribution of age and normalized position of an age-dependent supercritical branching Markov process. The motion of each individual during its life is a random function of its age. It is shown that the empirical distribution of the age and the normalized position of all individuals alive at time t converges as t -> infinity to a deterministic product measure.
Abstract:
We study the distribution of residence time or equivalently that of "mean magnetization" for a family of Gaussian Markov processes indexed by a positive parameter alpha. The persistence exponent for these processes is simply given by theta=alpha but the residence time distribution is nontrivial. The shape of this distribution undergoes a qualitative change as theta increases, indicating a sharp change in the ergodic properties of the process. We develop two alternate methods to calculate exactly but recursively the moments of the distribution for arbitrary alpha. For some special values of alpha, we obtain closed form expressions of the distribution function. [S1063-651X(99)03306-1].
Abstract:
We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.
Abstract:
In this article, we study a risk-sensitive control problem with controlled continuous-time Markov chain state dynamics. Using the multiplicative dynamic programming principle along with the atomic structure of the state dynamics, we prove the existence and a characterization of optimal risk-sensitive control under geometric ergodicity of the state dynamics along with a smallness condition on the running cost.
Abstract:
Let a and s denote the interarrival times and service times in a GI/GI/1 queue. Let a(n), s(n) be the r.v.s with distributions equal to the estimated distributions of a and s from iid samples of a and s of size n. Let w be a r.v. with the stationary distribution π of the waiting times of the queue with input (a, s). We consider the problem of estimating E[w^α], α > 0, and π via simulations when (a(n), s(n)) are used as input. Conditions for the accuracy of the asymptotic estimate, continuity of the asymptotic variance and uniformity in the rate of convergence to the estimate are obtained. We also obtain rates of convergence for sample moments, the empirical process and the quantile process for the regenerative processes. Robust estimates are also obtained when an outlier-contaminated sample of a and s is provided. In the process we obtain consistency, continuity and asymptotic normality of M-estimators for stationary sequences. Some robustness results for Markov processes are included.
Abstract:
The probability distribution of the eigenvalues of a second-order stochastic boundary value problem is considered. The solution is characterized in terms of the zeros of an associated initial value problem. It is further shown that the probability distribution is related to the solution of a first-order nonlinear stochastic differential equation. Solutions of this equation based on the theory of Markov processes and also on the closure approximation are presented. A string with stochastic mass distribution is considered as an example for numerical work. The theoretical probability distribution functions are compared with digital simulation results. The comparison is found to be reasonably good.
Abstract:
The splitting techniques for MARKOV chains developed by NUMMELIN (1978a) and ATHREYA and NEY (1978b) are used to derive an imbedded renewal process in WOLD's point process with MARKOV-correlated intervals. This leads to a simple proof of renewal theorems for such processes. In particular, a key renewal theorem is proved, from which analogues to both BLACKWELL's and BREIMAN's forms of the renewal theorem can be deduced.
Abstract:
We develop a simulation based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.
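For orientation, the problem class named in this abstract (finite horizon, finite states, finite actions) can be solved exactly on small instances by backward induction. The sketch below is that textbook baseline, not the simulation-based algorithm of the paper; the function name `backward_induction` and the two-state toy model are hypothetical and chosen only for illustration.

```python
def backward_induction(P, cost, terminal, horizon):
    """Exact dynamic programming for a finite-horizon, cost-minimizing MDP.

    P[a][s][s2] : probability of moving from state s to s2 under action a
    cost[s][a]  : one-stage cost of taking action a in state s
    terminal[s] : cost incurred at the end of the horizon in state s
    Returns the stage-0 value function and a per-stage greedy policy.
    """
    n_states = len(terminal)
    n_actions = len(P)
    V = list(terminal)
    policy = []
    for _ in range(horizon):
        new_V, stage_policy = [], []
        for s in range(n_states):
            # Expected cost-to-go of each action from state s.
            q = [
                cost[s][a] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=lambda a: q[a])
            new_V.append(q[best])
            stage_policy.append(best)
        V = new_V
        policy.insert(0, stage_policy)  # stages are filled from last to first
    return V, policy


# Toy model (an assumption, not from the paper): action 0 stays, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [1.0, 0.0]],  # action 1
]
cost = [[1.0, 1.5], [0.0, 0.5]]  # state 0 is expensive; switching adds 0.5
V0, policy = backward_induction(P, cost, terminal=[0.0, 0.0], horizon=3)
```

In this toy instance the early-stage policy switches out of the expensive state, while the last stage stays put because there is no future cost left to save.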
Abstract:
We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
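The SPSA gradient estimate mentioned in this abstract can be illustrated on a plain deterministic function. The sketch below is a generic two-sided SPSA estimator with symmetric ±1 perturbations, not the paper's actor update; the names `spsa_gradient`, `quadratic`, and `averaged_estimate`, the perturbation size `c`, and the test function are all assumptions made for illustration.

```python
import random


def spsa_gradient(f, theta, c, rng):
    """One two-sided SPSA gradient estimate of f at theta.

    All coordinates are perturbed at once by independent +/-1 draws, so
    only two evaluations of f are needed whatever the dimension.
    """
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = f(plus) - f(minus)
    return [diff / (2.0 * c * d) for d in delta]


def quadratic(x):
    # Hypothetical objective with known gradient 2*x, used only as a check.
    return sum(v * v for v in x)


def averaged_estimate(f, theta, c, n, seed=0):
    """Average n independent SPSA estimates to reduce their variance."""
    rng = random.Random(seed)
    total = [0.0] * len(theta)
    for _ in range(n):
        g = spsa_gradient(f, theta, c, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n for t in total]


theta = [1.0, -2.0, 0.5]
estimate = averaged_estimate(quadratic, theta, c=0.1, n=20000)
```

A single estimate is noisy because the perturbations of the other coordinates leak into each component; averaging many estimates brings the result close to the true gradient `[2.0, -4.0, 1.0]` for this quadratic.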
Abstract:
The existence of an optimal feedback law is established for the risk-sensitive optimal control problem with denumerable state space. The main assumptions imposed are irreducibility and a near monotonicity condition on the one-step cost function. A solution can be found constructively using either value iteration or policy iteration under suitable conditions on the initial feedback law.
Abstract:
We develop a simulation based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.
Abstract:
We develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance in this setting and converges to a feasible point.
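The Lagrange multiplier treatment of an inequality constraint can be sketched on a one-dimensional toy problem: minimize x^2 subject to 1 - x <= 0. The code below runs gradient descent on x and slower projected gradient ascent on the multiplier, which loosely mirrors a two-timescale scheme; it is not the paper's actor-critic algorithm, and the function name `primal_dual`, the step sizes, and the toy problem are assumptions made for illustration.

```python
def primal_dual(steps=5000, a=0.1, b=0.01):
    """Primal-dual iteration on L(x, lam) = x**2 + lam * (1 - x).

    x takes the larger step a (faster timescale); lam takes the smaller
    step b and is projected onto lam >= 0 after each update.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = 2.0 * x - lam   # dL/dx
        grad_lam = 1.0 - x       # dL/dlam, i.e. the constraint violation
        # Both updates use the values from the previous iteration.
        x, lam = x - a * grad_x, max(0.0, lam + b * grad_lam)
    return x, lam


x_star, lam_star = primal_dual()
```

For this toy problem the iterates settle at the constrained minimizer x = 1 with multiplier 2, the point where the objective gradient 2x is balanced by the multiplier.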