Digital Library

999 results for Infinite horizon economies

Necessary conditions of optimality for state constrained infinite horizon differential inclusions

Relevance:

100.00%

Publisher:

Abstract:

This article presents and discusses necessary conditions of optimality for infinite horizon dynamic optimization problems with inequality state constraints and set inclusion constraints at both endpoints of the trajectory. The cost functional depends on the state variable at the final time, and the dynamics are given by a differential inclusion. Moreover, the optimization is carried out over asymptotically convergent state trajectories. The novelty of the proposed optimality conditions for this class of problems is that the boundary condition of the adjoint variable is given as a weak directional inclusion at infinity. This improves on the currently available necessary conditions of optimality for infinite horizon problems. © 2011 IEEE.

A maximum principle for constrained infinite horizon dynamic control systems

Relevance:

100.00%

Publisher:

Abstract:

This article presents and discusses a maximum principle for infinite horizon constrained optimal control problems with a cost functional depending on the state at the final time. The main feature of these optimality conditions is that, under reasonably weak assumptions, the multiplier is shown to satisfy a novel transversality condition at infinite time. It is also shown that these conditions can be obtained for impulsive control problems whose dynamics are given by measure-driven differential equations. © 2011 IFAC.

Collateral, default penalties and almost finite-time solvency

Relevance:

100.00%

Publisher:

Abstract:

We argue that it is possible to adapt the approach of imposing restrictions on available plans through finitely effective debt constraints, introduced by Levine and Zame (1996), to encompass models with default and collateral. Along this line, we introduce the concept of almost finite-time solvency in the setting of Araujo, Páscoa and Torres-Martínez (2002) and Páscoa and Seghir (2008). We show that the conditions imposed in these two papers to rule out Ponzi schemes implicitly restrict actions to be almost finite-time solvent. We define the notion of equilibrium with almost finite-time solvency and look for sufficient conditions for its existence. Under a mild assumption on default penalties, namely that agents are myopic with respect to default penalties, we prove that existence is guaranteed (and Ponzi schemes are ruled out) when actions are restricted to be almost finite-time solvent. The proof is very simple and intuitive. In particular, the main existence results in Araujo et al. (2002) and Páscoa and Seghir (2008) are simple corollaries of our existence result.

Parametrized actor-critic algorithms for finite-horizon MDPs

Relevance:

90.00%

Publisher:

Abstract:

Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show w.p. 1 convergence to a set with the necessary condition for constrained optima. The proposed parameterization is for FH-MDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as a critic for the proposed actor recursion. Simulation results from flow control in communication networks attest to the performance advantages of all three algorithms.

The Possibility of Ordering Infinite Utility Streams

Relevance:

90.00%

Publisher:

Abstract:

This paper revisits Diamond’s classical impossibility result regarding the ordering of infinite utility streams. We show that if no representability condition is imposed, there do exist strongly Paretian and finitely anonymous orderings of intertemporal utility streams with attractive additional properties. We extend a possibility theorem due to Svensson to a characterization theorem and we provide characterizations of all strongly Paretian and finitely anonymous rankings satisfying the strict transfer principle. In addition, infinite horizon extensions of leximin and of utilitarianism are characterized by adding an equity preference axiom and finite translation-scale measurability, respectively, to strong Pareto and finite anonymity.

A maximum principle for infinite time asymptotically stable impulsive dynamic control systems

Relevance:

90.00%

Publisher:

Abstract:

We consider infinite horizon optimal impulsive control problems in which a given cost function is minimized by choosing control strategies that drive the state to a point in a given closed set C_∞. We present necessary conditions of optimality in the form of a maximum principle in which the boundary condition of the adjoint variable ensures non-degeneracy even though the time horizon is infinite. These conditions are given first for conventional systems and then for impulsive control problems. They are proved by considering a family of approximating auxiliary conventional (without impulses) optimal control problems defined on an increasing sequence of finite time intervals. As far as we know, results of this kind have not been derived previously. © 2010 IFAC.

Visual predictive control of spiral motion

Relevance:

80.00%

Publisher:

Abstract:

This paper deals with constrained image-based visual servoing of circular and conical spiral motion about an unknown object approximating a single image point feature. Effective visual control of such trajectories has many applications for small unmanned aerial vehicles, including surveillance and inspection, forced landing (homing), and collision avoidance. A spherical camera model is used to derive a novel visual-predictive controller (VPC) using stability-based design methods for general nonlinear model-predictive control. In particular, a quasi-infinite horizon visual-predictive control scheme is derived. A terminal region, which is used as a constraint in the controller structure, can be used to guide appropriate reference image features for spiral tracking with respect to nominal stability and feasibility. Robustness properties are also discussed with respect to parameter uncertainty and additive noise. A comparison with competing visual-predictive control schemes is made, and some experimental results using a small quad rotor platform are given.

A Fair Contract for Managing Water Scarcity

Relevance:

80.00%

Publisher:

Abstract:

In public utilities, under supply constraints, fairness considerations lead to a market failure. This paper characterizes a two-period principal-agent contract for demand management that mitigates this market failure in urban water systems. The contract is designed as an extensive form mechanism using subgame perfect Nash equilibrium (SPNE) as the solution concept. The contract is fair and is shown to be economically efficient if, in case of deviation by the agent, the gain to the agent and the loss to the principal are small. It is shown that this assumption can be avoided in an infinite horizon contract.

Solving MDPs using two-timescale simulated annealing with multiplicative weights

Relevance:

80.00%

Publisher:

Abstract:

We develop extensions of the Simulated Annealing with Multiplicative Weights (SAMW) algorithm, which was proposed as a solution method for Finite-Horizon Markov Decision Processes (FH-MDPs). The extensions are in three directions: (a) use of the dynamic programming principle in the policy update step of SAMW; (b) a two-timescale actor-critic algorithm that uses simulated transitions alone; and (c) an extension of the algorithm to the infinite-horizon discounted-reward scenario. In particular, (a) reduces the storage required from exponential to linear in the number of actions per stage-state pair. On the faster timescale, a 'critic' recursion performs policy evaluation, while on the slower timescale an 'actor' recursion performs policy improvement using SAMW. We give a proof outlining convergence w.p. 1 and show experimental results in two settings: semiconductor fabrication and flow control in communication networks.

Solution of MDPs using simulation-based value iteration

Relevance:

80.00%

Publisher:

Abstract:

This article proposes a three-timescale simulation-based algorithm for the solution of infinite horizon Markov Decision Processes (MDPs). We assume a finite state space and a discounted cost criterion and adopt the value iteration approach. An approximation of the Dynamic Programming operator T is applied to the value function iterates. This 'approximate' operator is implemented using three timescales, the slowest of which updates the value function iterates. On the middle timescale we perform a gradient search over the feasible action set of each state using Simultaneous Perturbation Stochastic Approximation (SPSA) gradient estimates, thus finding the minimizing action in T. On the fastest timescale, the 'critic' estimates, over which the gradient search is performed, are obtained. A sketch of convergence explaining the dynamics of the algorithm using associated ODEs is also presented. Numerical experiments on rate-based flow control on a bottleneck node using a continuous-time queueing model are performed using the proposed algorithm. The results obtained are verified against classical value iteration where the feasible set is suitably discretized. Over such a discretized setting, a variant of the algorithm of [12] is compared and the proposed algorithm is found to converge faster.
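For orientation, the classical value iteration that the abstract uses as its verification baseline can be sketched as follows. This is only a minimal illustration for a finite-state, discounted-cost MDP, not the three-timescale simulation-based algorithm of the article. The function name, the data layout (`transitions[s][a]` as a list of `(next_state, probability)` pairs) and the two-state toy problem are assumptions made for this example.

```python
def value_iteration(transitions, costs, gamma, tol=1e-10, max_iter=10_000):
    """Classical value iteration for a finite, discounted-cost MDP.

    transitions[s][a] is a list of (next_state, probability) pairs and
    costs[s][a] is the one-stage cost (hypothetical layout for this sketch).
    """
    n_states = len(transitions)
    values = [0.0] * n_states

    def q_value(s, a, v):
        # One-stage cost plus discounted expected cost-to-go.
        return costs[s][a] + gamma * sum(p * v[t] for t, p in transitions[s][a])

    for _ in range(max_iter):
        # Apply the dynamic programming operator T: minimize over actions.
        new_values = [
            min(q_value(s, a, values) for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
        gap = max(abs(x - y) for x, y in zip(values, new_values))
        values = new_values
        if gap < tol:
            break

    # Greedy (cost-minimizing) policy with respect to the final values.
    policy = [
        min(range(len(transitions[s])), key=lambda a: q_value(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Toy problem: in state 0, staying costs 1 per stage, while moving to the
# absorbing zero-cost state 1 costs 2 once.
transitions = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(1, 1.0)]],
]
costs = [[1.0, 2.0], [0.0]]
values, policy = value_iteration(transitions, costs, gamma=0.9)
```

In the toy problem, paying the one-time cost of 2 beats paying 1 per stage forever at discount factor 0.9, so the greedy policy moves out of state 0.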

An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes

Relevance:

80.00%

Publisher:

Abstract:

We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. © 2010 Elsevier B.V. All rights reserved.
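As a rough illustration of the SPSA gradient estimates mentioned in the abstract, the sketch below shows the generic two-sided SPSA estimator with random ±1 perturbations. It is not the article's actor recursion: the function name, the step parameter `delta`, and the quadratic test objective are assumptions made for this example, and in the article the objective would be a noisy estimate supplied by the critic.

```python
import random


def spsa_gradient(f, theta, delta=0.1, rng=None):
    """Two-sided SPSA gradient estimate of f at theta.

    All coordinates are perturbed simultaneously by +/- delta, so only two
    evaluations of f are needed regardless of the dimension of theta.
    """
    rng = rng or random.Random()
    # Independent symmetric +/-1 perturbation for each coordinate.
    signs = [rng.choice((-1.0, 1.0)) for _ in theta]
    plus = [t + delta * s for t, s in zip(theta, signs)]
    minus = [t - delta * s for t, s in zip(theta, signs)]
    diff = f(plus) - f(minus)
    # Each coordinate divides the same difference by its own perturbation.
    return [diff / (2.0 * delta * s) for s in signs]


# One-dimensional check on f(theta) = theta^2, whose gradient at 3 is 6.
estimate = spsa_gradient(lambda th: th[0] ** 2, [3.0], delta=0.1,
                         rng=random.Random(0))
```

In one dimension with a quadratic objective the estimate is exact up to rounding; in higher dimensions each single estimate is noisy, and the gradient is recovered only on average across perturbations.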

On the optimality of exhaustive service policies in multiclass queueing systems with modulated arrivals and switchovers

Relevance:

80.00%

Publisher:

Abstract:

Consider a single-server multiclass queueing system with K classes where the individual queues are fed by K-correlated interrupted Poisson streams generated in the states of a K-state stationary modulating Markov chain. The service times for all the classes are drawn independently from the same distribution. There is a setup time (and/or a setup cost) incurred whenever the server switches from one queue to another. It is required to minimize the sum of discounted inventory and setup costs over an infinite horizon. We provide sufficient conditions under which exhaustive service policies are optimal. We then present some simulation results for a two-class queueing system to show that exhaustive, threshold policies outperform non-exhaustive policies.

An Optimized SDE Model for Slotted Aloha

Relevance:

80.00%

Publisher:

Abstract:

We consider a stochastic differential equation (SDE) model of slotted Aloha with the retransmission probability as the associated parameter. We formulate the problem in both (a) the finite horizon and (b) the infinite horizon average cost settings. We apply the algorithm of [3] for the first setting, while for the second, we adapt a related algorithm from [2] that was originally developed in the simulation optimization framework. In the first setting, we obtain an optimal parameter trajectory that prescribes the parameter to use at any given instant, while in the second setting, we obtain an optimal time-invariant parameter. Our algorithms are seen to exhibit good performance.

Bidding Dynamics of Rational Advertisers in Sponsored Search Auctions on the Web

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we address a key problem faced by advertisers in sponsored search auctions on the web: how much to bid, given the bids of the other advertisers, so as to maximize individual payoffs? Assuming the generalized second price auction as the auction mechanism, we formulate this problem in the framework of an infinite horizon alternative-move game of advertiser bidding behavior. For a sponsored search auction involving two advertisers, we characterize all the pure strategy and mixed strategy Nash equilibria. We also prove that the bid prices will lead to a Nash equilibrium, if the advertisers follow a myopic best response bidding strategy. Following this, we investigate the bidding behavior of the advertisers if they use Q-learning. We discover empirically an interesting trend that the Q-values converge even if both the advertisers learn simultaneously.
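The Q-learning step referred to in the abstract can be illustrated with the standard tabular update rule. This is a generic sketch only: the abstract does not specify how states, actions (bids) or rewards are encoded, so the table layout, the parameter values and the function name below are all assumptions made for this example.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new entry.

    q is a list of rows, one per state, each holding one value per action
    (hypothetical layout for this sketch).
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    target = reward + gamma * max(q[next_state])
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Two states, two actions; update the entry for state 0, action 1.
q_table = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

Here the target is 1.0 + 0.9 * 2.0 = 2.8, and with a learning rate of 0.5 the entry moves halfway from 0.0 to that target.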

A Simultaneous Deterministic Perturbation Actor-Critic Algorithm with an Application to Optimal Mortgage Refinancing

Relevance:

80.00%

Publisher:

Abstract:

We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov decision processes with finite state and action spaces, with a discounted reward criterion. The algorithm is of the gradient ascent type and performs a search in the space of stationary randomized policies. The algorithm uses certain simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates for enhanced performance. We show an application of our algorithm to a problem of mortgage refinancing. Our algorithm obtains the optimal refinancing strategies in a computationally efficient manner.
