990 resultados para Infinite horizon problems
Resumo:
In my PhD thesis I propose a Bayesian nonparametric estimation method for structural econometric models where the functional parameter of interest describes the economic agent's behavior. The structural parameter is characterized as the solution of a functional equation, or by using more technical words, as the solution of an inverse problem that can be either ill-posed or well-posed. From a Bayesian point of view, the parameter of interest is a random function and the solution to the inference problem is the posterior distribution of this parameter. A regular version of the posterior distribution in functional spaces is characterized. However, the infinite dimension of the considered spaces causes a problem of non continuity of the solution and then a problem of inconsistency, from a frequentist point of view, of the posterior distribution (i.e. problem of ill-posedness). The contribution of this essay is to propose new methods to deal with this problem of ill-posedness. The first one consists in adopting a Tikhonov regularization scheme in the construction of the posterior distribution so that I end up with a new object that I call regularized posterior distribution and that I guess it is solution of the inverse problem. The second approach consists in specifying a prior distribution on the parameter of interest of the g-prior type. Then, I detect a class of models for which the prior distribution is able to correct for the ill-posedness also in infinite dimensional problems. I study asymptotic properties of these proposed solutions and I prove that, under some regularity condition satisfied by the true value of the parameter of interest, they are consistent in a "frequentist" sense. Once I have set the general theory, I apply my bayesian nonparametric methodology to different estimation problems. First, I apply this estimator to deconvolution and to hazard rate, density and regression estimation. Then, I consider the estimation of an Instrumental Regression that is useful in micro-econometrics when we have to deal with problems of endogeneity. Finally, I develop an application in finance: I get the bayesian estimator for the equilibrium asset pricing functional by using the Euler equation defined in the Lucas'(1978) tree-type models.
Resumo:
Our main goal is to compute or estimate the calmness modulus of the argmin mapping of linear semi-infinite optimization problems under canonical perturbations, i.e., perturbations of the objective function together with continuous perturbations of the right-hand side of the constraint system (with respect to an index ranging in a compact Hausdorff space). Specifically, we provide a lower bound on the calmness modulus for semi-infinite programs with unique optimal solution which turns out to be the exact modulus when the problem is finitely constrained. The relationship between the calmness of the argmin mapping and the same property for the (sub)level set mapping (with respect to the objective function), for semi-infinite programs and without requiring the uniqueness of the nominal solution, is explored, too, providing an upper bound on the calmness modulus of the argmin mapping. When confined to finitely constrained problems, we also provide a computable upper bound as it only relies on the nominal data and parameters, not involving elements in a neighborhood. Illustrative examples are provided.
Resumo:
The Remez penalty and smoothing algorithm (RPSALG) is a unified framework for penalty and smoothing methods for solving min-max convex semi-infinite programing problems, whose convergence was analyzed in a previous paper of three of the authors. In this paper we consider a partial implementation of RPSALG for solving ordinary convex semi-infinite programming problems. Each iteration of RPSALG involves two types of auxiliary optimization problems: the first one consists of obtaining an approximate solution of some discretized convex problem, while the second one requires to solve a non-convex optimization problem involving the parametric constraints as objective function with the parameter as variable. In this paper we tackle the latter problem with a variant of the cutting angle method called ECAM, a global optimization procedure for solving Lipschitz programming problems. We implement different variants of RPSALG which are compared with the unique publicly available SIP solver, NSIPS, on a battery of test problems.
Resumo:
Given a convex optimization problem (P) in a locally convex topological vector space X with an arbitrary number of constraints, we consider three possible dual problems of (P), namely, the usual Lagrangian dual (D), the perturbational dual (Q), and the surrogate dual (Δ), the last one recently introduced in a previous paper of the authors (Goberna et al., J Convex Anal 21(4), 2014). As shown by simple examples, these dual problems may be all different. This paper provides conditions ensuring that inf(P)=max(D), inf(P)=max(Q), and inf(P)=max(Δ) (dual equality and existence of dual optimal solutions) in terms of the so-called closedness regarding to a set. Sufficient conditions guaranteeing min(P)=sup(Q) (dual equality and existence of primal optimal solutions) are also provided, for the nominal problems and also for their perturbational relatives. The particular cases of convex semi-infinite optimization problems (in which either the number of constraints or the dimension of X, but not both, is finite) and linear infinite optimization problems are analyzed. Finally, some applications to the feasibility of convex inequality systems are described.
Resumo:
Convex vector (or multi-objective) semi-infinite optimization deals with the simultaneous minimization of finitely many convex scalar functions subject to infinitely many convex constraints. This paper provides characterizations of the weakly efficient, efficient and properly efficient points in terms of cones involving the data and Karush–Kuhn–Tucker conditions. The latter characterizations rely on different local and global constraint qualifications. The results in this paper generalize those obtained by the same authors on linear vector semi-infinite optimization problems.
Resumo:
Agency costs are said to arise as a result of the separation of ownership from control inherent in the corporate form of ownership. One such agency problem concerns the potential variance between the time horizons of principal shareholders and agent managers. Agency theory suggests that these costs can be alleviated or controlled through performance-based Chief Executive Officer (CEO) contracting. However, components of a CEO's compensation contract can exacerbate or mitigate agency-related problems (Antle and Smith, 1985). According to the horizon hypothesis, a self-serving CEO reduces discretionary research and development (R&D) expenditures to increase earnings and earnings-based bonus compensation. Agency theorists contend that a CEO's market-based compensation component can mitigate horizon problems. This study seeks to determine whether there is a relationship between CEO earnings- and market-based compensation components and R&D expenditures in the largest United States industrial firms from 1987 to 1993.^ Consistent with the horizon hypothesis, results provide evidence of a negative and statistically significant relationship between CEO cash compensation (i.e., salary and bonus) and the firm's R&D expenditures. Consistent with the expectations of agency theory, results provide evidence of a positive and statistically significant relationship between market-based CEO compensation and R&D.^ Further results of this study provide evidence of a positive and statistically significant relationship between CEO tenure and the firm's R&D expenditures. Although there is a negative relationship between CEO age and the firm's R&D, it was not statistically significant at the 0.5 level. ^
Resumo:
This paper deals with constrained image-based visual servoing of circular and conical spiral motion about an unknown object approximating a single image point feature. Effective visual control of such trajectories has many applications for small unmanned aerial vehicles, including surveillance and inspection, forced landing (homing), and collision avoidance. A spherical camera model is used to derive a novel visual-predictive controller (VPC) using stability-based design methods for general nonlinear model-predictive control. In particular, a quasi-infinite horizon visual-predictive control scheme is derived. A terminal region, which is used as a constraint in the controller structure, can be used to guide appropriate reference image features for spiral tracking with respect to nominal stability and feasibility. Robustness properties are also discussed with respect to parameter uncertainty and additive noise. A comparison with competing visual-predictive control schemes is made, and some experimental results using a small quad rotor platform are given.
Resumo:
In public utilities, under supply constraints, fairness considerations lead to a market failure. This paper characterizes a two-period principal-agent contract for demand management, that mitigates this market failure in urban water systems. The contract is designed as an extensive form mechanism using subgame perfect Nash equilibrium (SPNE) as the solution concept. The contract is fair; and is shown to be economically efficient if, in case of deviation by the agent, the gain to the agent and the loss to the principal are small. It is shown that the assumption can be avoided in an infinite horizon contract.
Resumo:
We develop extensions of the Simulated Annealing with Multiplicative Weights (SAMW) algorithm that proposed a method of solution of Finite-Horizon Markov Decision Processes (FH-MDPs). The extensions developed are in three directions: a) Use of the dynamic programming principle in the policy update step of SAMW b) A two-timescale actor-critic algorithm that uses simulated transitions alone, and c) Extending the algorithm to the infinite-horizon discounted-reward scenario. In particular, a) reduces the storage required from exponential to linear in the number of actions per stage-state pair. On the faster timescale, a 'critic' recursion performs policy evaluation while on the slower timescale an 'actor' recursion performs policy improvement using SAMW. We give a proof outlining convergence w.p. 1 and show experimental results on two settings: semiconductor fabrication and flow control in communication networks.
Resumo:
This article proposes a three-timescale simulation based algorithm for solution of infinite horizon Markov Decision Processes (MDPs). We assume a finite state space and discounted cost criterion and adopt the value iteration approach. An approximation of the Dynamic Programming operator T is applied to the value function iterates. This 'approximate' operator is implemented using three timescales, the slowest of which updates the value function iterates. On the middle timescale we perform a gradient search over the feasible action set of each state using Simultaneous Perturbation Stochastic Approximation (SPSA) gradient estimates, thus finding the minimizing action in T. On the fastest timescale, the 'critic' estimates, over which the gradient search is performed, are obtained. A sketch of convergence explaining the dynamics of the algorithm using associated ODEs is also presented. Numerical experiments on rate based flow control on a bottleneck node using a continuous-time queueing model are performed using the proposed algorithm. The results obtained are verified against classical value iteration where the feasible set is suitably discretized. Over such a discretized setting, a variant of the algorithm of [12] is compared and the proposed algorithm is found to converge faster.
Resumo:
We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Consider a single-server multiclass queueing system with K classes where the individual queues are fed by K-correlated interrupted Poisson streams generated in the states of a K-state stationary modulating Markov chain. The service times for all the classes are drawn independently from the same distribution. There is a setup time (and/or a setup cost) incurred whenever the server switches from one queue to another. It is required to minimize the sum of discounted inventory and setup costs over an infinite horizon. We provide sufficient conditions under which exhaustive service policies are optimal. We then present some simulation results for a two-class queueing system to show that exhaustive, threshold policies outperform non-exhaustive policies.
Resumo:
We consider a stochastic differential equation (SDE) model of slotted Aloha with the retransmission probability as the associated parameter. We formulate the problem in both (a) the finite horizon and (b) the infinite horizon average cost settings. We apply the algorithm of 3] for the first setting, while for the second, we adapt a related algorithm from 2] that was originally developed in the simulation optimization framework. In the first setting, we obtain an optimal parameter trajectory that prescribes the parameter to use at any given instant while in the second setting, we obtain an optimal time-invariant parameter. Our algorithms are seen to exhibit good performance.
Resumo:
In this paper, we address a key problem faced by advertisers in sponsored search auctions on the web: how much to bid, given the bids of the other advertisers, so as to maximize individual payoffs? Assuming the generalized second price auction as the auction mechanism, we formulate this problem in the framework of an infinite horizon alternative-move game of advertiser bidding behavior. For a sponsored search auction involving two advertisers, we characterize all the pure strategy and mixed strategy Nash equilibria. We also prove that the bid prices will lead to a Nash equilibrium, if the advertisers follow a myopic best response bidding strategy. Following this, we investigate the bidding behavior of the advertisers if they use Q-learning. We discover empirically an interesting trend that the Q-values converge even if both the advertisers learn simultaneously.
Resumo:
We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov decision processes with finite state and action spaces, with a discounted reward criterion. The algorithm is of the gradient ascent type and performs a search in the space of stationary randomized policies. The algorithm uses certain simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates for enhanced performance. We show an application of our algorithm on a problem of mortgage refinancing. Our algorithm obtains the optimal refinancing strategies in a computationally efficient manner