Digital Library

878 results for semi-Markov decision process

Portfolio Optimization in a Semi-Markov Modulated Market

Relevance: 100.00%

Abstract:

We address a portfolio optimization problem in a semi-Markov modulated market. We study both terminal expected utility optimization over a finite time horizon and risk-sensitive portfolio optimization over finite and infinite time horizons. We obtain optimal portfolios in relevant cases. A numerical procedure is also developed to compute the optimal expected terminal utility for the finite-horizon problem.

A reinforcement learning based algorithm for finite horizon Markov decision processes

Relevance: 100.00%

Abstract:

We develop a simulation-based algorithm for finite-horizon Markov decision processes with finite state and action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Limit Theorems for Semi-Markov Processes and Renewal Theory for Markov Chains

Relevance: 100.00%

An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes

Relevance: 100.00%

Abstract:

We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
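The simultaneous perturbation idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the article's actor-critic algorithm: it shows only a generic two-sided SPSA gradient estimate driving a plain descent loop. The function names, step-size constants, and any objective passed in are hypothetical choices made for illustration.

```python
import numpy as np

def spsa_gradient(f, theta, delta, rng):
    """Two-sided SPSA estimate of the gradient of f at theta.

    Only two evaluations of f are needed, whatever the dimension of theta.
    """
    # Random +1/-1 perturbation, one entry per parameter.
    perturb = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = f(theta + delta * perturb) - f(theta - delta * perturb)
    # Every component shares the same difference, scaled by its own perturbation.
    return diff / (2.0 * delta * perturb)

def spsa_minimize(f, theta0, steps=2000, a=0.1, delta=0.1, seed=0):
    """Descent loop using SPSA estimates (hypothetical step-size schedule)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(1, steps + 1):
        grad = spsa_gradient(f, theta, delta, rng)
        # Slowly decaying step size a / k**0.602.
        theta = theta - (a / k ** 0.602) * grad
    return theta
```

For example, calling `spsa_minimize` on a simple quadratic with a known minimizer gives a quick check that the estimate points in a useful direction on average.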

Risk-sensitive optimal control for Markov decision processes with monotone cost

Relevance: 100.00%

Abstract:

The existence of an optimal feedback law is established for the risk-sensitive optimal control problem with denumerable state space. The main assumptions imposed are irreducibility and a near-monotonicity condition on the one-step cost function. A solution can be found constructively using either value iteration or policy iteration, under suitable conditions on the initial feedback law.

An Actor-Critic Algorithm for Finite Horizon Markov Decision Processes

Relevance: 100.00%

Abstract:

We develop a simulation-based algorithm for finite-horizon Markov decision processes with finite state and action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Transmit power Control with ARQ in energy harvesting sensors: a decision-theoretic approach

Relevance: 100.00%

Abstract:

This paper addresses the problem of finding optimal power control policies for wireless energy harvesting sensor (EHS) nodes with automatic repeat request (ARQ)-based packet transmissions. The EHS harvests energy from the environment according to a Bernoulli process, and it is required to operate within the constraint of energy neutrality. The EHS obtains partial channel state information (CSI) at the transmitter through the link-layer ARQ protocol, via the ACK/NACK feedback messages, and uses it to adapt the transmission power for the packet (re)transmission attempts. The underlying wireless fading channel is modeled as a finite-state Markov chain with known transition probabilities. Thus, the goal of the power management policy is to determine the best power setting for the current packet transmission attempt, so as to maximize a long-run expected reward such as the expected outage probability. The problem is addressed in a decision-theoretic framework by casting it as a partially observable Markov decision process (POMDP). Due to the large size of the state space, the exact solution to the POMDP is computationally expensive. Hence, two popular approximate solutions are considered, which yield good power management policies for the transmission attempts. Monte Carlo simulation results illustrate the efficacy of the approach and show that the approximate solutions significantly outperform conventional approaches.
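The role of ACK/NACK feedback as partial channel state information can be sketched with a generic Bayes-filter belief update over a finite-state Markov channel. This is an illustrative sketch only, not the paper's formulation: the function name, the assumption that the ACK probability depends only on the current channel state at the chosen power, and any numbers supplied by a caller are hypothetical.

```python
import numpy as np

def belief_update(belief, transition, p_ack, ack):
    """One belief-update step for a channel observed only through ACK/NACK.

    belief:     current probability of each channel state, shape (n,)
    transition: transition[s, s2] = P(next state s2 | current state s)
    p_ack:      p_ack[s] = assumed P(ACK | state s) at the chosen power
    ack:        True if an ACK was received, False for a NACK
    Returns the predicted belief over the channel state in the next slot.
    """
    belief = np.asarray(belief, dtype=float)
    p_ack = np.asarray(p_ack, dtype=float)
    # Likelihood of the observed feedback in each state.
    likelihood = p_ack if ack else 1.0 - p_ack
    posterior = belief * likelihood
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    posterior = posterior / total
    # Propagate the corrected belief through the channel's Markov chain.
    return posterior @ np.asarray(transition, dtype=float)
```

A power control policy in this setting would act on the returned belief vector rather than on the unobserved channel state itself.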

Stable Markov decision processes using simulation based predictive control

Relevance: 100.00%

Partially observable Markov decision processes with continuous observations for dialogue management

Relevance: 100.00%

Factored partially observable Markov decision processes for dialogue management

Relevance: 100.00%

Secondary structure prediction using Sigmoid belief networks to parameterize segmental semi-Markov models

Relevance: 100.00%

Two time-scale stochastic approximation for constrained stochastic optimization and constrained Markov decision problems

Relevance: 100.00%

Dialog Systems based on Markov decision processes over two real tasks

Relevance: 100.00%

Abstract:

This work describes the state of the art in automatic dialogue strategy management using Markov decision processes (MDP) with reinforcement learning (RL). Partially observable Markov decision processes (POMDP) are also described. To test the validity of these methods, two spoken dialogue systems have been developed. The first is a spoken dialogue system for providing weather forecasts, and the second is a more complex system for train information. With the first system, a rule-based system and an automatically trained system were compared, using a real corpus to train the automatic strategy. With the second system, the scalability of these methods to larger systems was tested.

Hidden Markov decision trees

Relevance: 100.00%
