981 results for Dynamic programming (DP)
Abstract:
The network revenue management (RM) problem arises in airline, hotel, media, and other industries where the products sold use multiple resources. It can be formulated as a stochastic dynamic program, but the dynamic program is computationally intractable because of an exponentially large state space, and a number of heuristics have been proposed to approximate it. Notable amongst these, both for their revenue performance and for their theoretically sound basis, are approximate dynamic programming methods that approximate the value function by basis functions (both affine and piecewise-linear functions have been proposed for network RM) and decomposition methods that relax the constraints of the dynamic program to solve simpler dynamic programs (such as the Lagrangian relaxation methods). In this paper we show that these two seemingly distinct approaches coincide for the network RM dynamic program, i.e., the piecewise-linear approximation method and the Lagrangian relaxation method are one and the same.
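To make the intractability concrete, the following is a minimal sketch of the exact dynamic program on a toy network. It is not the paper's approximation method. The two-leg network, fares, arrival probabilities, and horizon are all hypothetical, and the sketch assumes at most one request arrives per period. The number of states per period is the product of (capacity + 1) over all resources, which is why the exact recursion stops being practical on realistic networks.

```python
from functools import lru_cache

# Hypothetical toy network: 2 resources (flight legs) and 3 products.
CAPACITY = (2, 2)                      # remaining seats on each leg
PRODUCTS = [                           # (revenue, resource usage, arrival probability)
    (100.0, (1, 0), 0.3),              # itinerary using leg 1 only
    (120.0, (0, 1), 0.3),              # itinerary using leg 2 only
    (180.0, (1, 1), 0.2),              # connecting itinerary using both legs
]
HORIZON = 10                           # number of periods (assumed: one request at most per period)


@lru_cache(maxsize=None)
def value(t, x):
    """Expected optimal revenue from period t onward with remaining capacity x."""
    if t == HORIZON:
        return 0.0
    stay = value(t + 1, x)             # value if nothing is sold this period
    total = 0.0
    p_none = 1.0                       # probability that no request arrives
    for revenue, usage, prob in PRODUCTS:
        p_none -= prob
        after = tuple(xi - ui for xi, ui in zip(x, usage))
        if min(after) >= 0:
            # Accept the request only if it beats keeping the capacity.
            total += prob * max(revenue + value(t + 1, after), stay)
        else:
            # Not enough capacity: the request must be rejected.
            total += prob * stay
    return total + p_none * stay


if __name__ == "__main__":
    print(value(0, CAPACITY))
```

Memoizing on the full capacity vector is what makes the state space explicit: adding a third leg with capacity 2 would triple the number of cached states per period.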
Abstract:
We obtain a recursive formulation for a general class of contracting problems involving incentive constraints. Under these constraints, the corresponding maximization (sup) problems fail to have a recursive solution. Our approach consists of studying the Lagrangian. We show that, under standard assumptions, the solution to the Lagrangian is characterized by a recursive saddle point (infsup) functional equation, analogous to Bellman's equation. Our approach applies to a large class of contractual problems. As examples, we study the optimal policy in a model with intertemporal participation constraints (which arise in models of default) and intertemporal competitive constraints (which arise in Ramsey equilibria).
Abstract:
This paper presents and estimates a dynamic choice model in the attribute space considering rational consumers. In light of the evidence of several state-dependence patterns, the standard attribute-based model is extended by considering a general utility function where pure inertia and pure variety-seeking behaviors can be explained in the model as particular linear cases. The dynamics of the model are fully characterized by standard dynamic programming techniques. The model presents a stationary consumption pattern that can be inertial, where the consumer only buys one product, or a variety-seeking one, where the consumer shifts among varied products. We run some simulations to analyze the consumption paths out of the steady state. Under the hybrid utility assumption, the consumer behaves inertially among the unfamiliar brands for several periods, eventually switching to a variety-seeking behavior when the stationary levels are approached. An empirical analysis is run using scanner databases for three different product categories: fabric softener, saltine cracker, and catsup. Non-linear specifications provide the best fit to the data, as hybrid functional forms are found in all the product categories for most attributes and segments. These results reveal the statistical superiority of the non-linear structure and confirm the gradual trend to seek variety as the level of familiarity with the purchased items increases.
Abstract:
The paper develops a method to solve higher-dimensional stochastic control problems in continuous time. A finite-difference-type approximation scheme is used on a coarse grid of low-discrepancy points, while the value function at intermediate points is obtained by regression. The stability properties of the method are discussed, and applications are given to test problems of up to 10 dimensions. Accurate solutions to these problems can be obtained on a personal computer.
Abstract:
Customer choice behavior, such as 'buy-up' and 'buy-down', is an important phenomenon in a wide range of industries. Yet there are few models or methodologies available to exploit this phenomenon within yield management systems. We make some progress on filling this void. Specifically, we develop a model of yield management in which the buyers' behavior is modeled explicitly using a multinomial logit model of demand. The control problem is to decide which subset of fare classes to offer at each point in time. The set of open fare classes then affects the purchase probabilities for each class. We formulate a dynamic program to determine the optimal control policy and show that it reduces to a dynamic nested allocation policy. Thus, the optimal choice-based policy can easily be implemented in reservation systems that use nested allocation controls. We also develop an estimation procedure for our model based on the expectation-maximization (EM) method that jointly estimates arrival rates and choice model parameters when no-purchase outcomes are unobservable. Numerical results show that this combined optimization-estimation approach may significantly improve revenue performance relative to traditional leg-based models that do not account for choice behavior.
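The control problem described above can be sketched as a small dynamic program over (time, remaining seats) in which the decision is the set of fare classes to open. This is only an illustration under stated assumptions. The fares, multinomial logit weights, arrival probability, and horizon are hypothetical. The sketch enumerates every subset by brute force rather than using the nested structure the paper derives, and it omits the EM estimation step entirely.

```python
from functools import lru_cache
from itertools import combinations

# All numbers below are hypothetical illustration values.
FARES = (300.0, 200.0, 100.0)      # fare classes, highest fare first
WEIGHTS = (1.0, 2.0, 4.0)          # MNL preference weights, one per fare class
NO_PURCHASE_WEIGHT = 2.0           # MNL weight of the no-purchase alternative
ARRIVAL_PROB = 0.8                 # probability that a customer arrives in a period
HORIZON = 20                       # number of periods

# Every subset of fare classes, including the empty set (offer nothing).
OFFER_SETS = [
    subset
    for size in range(len(FARES) + 1)
    for subset in combinations(range(len(FARES)), size)
]


def purchase_probs(offer_set):
    """MNL purchase probability of each offered class, given the open set."""
    denom = NO_PURCHASE_WEIGHT + sum(WEIGHTS[j] for j in offer_set)
    return {j: WEIGHTS[j] / denom for j in offer_set}


@lru_cache(maxsize=None)
def solve(t, seats):
    """Return (expected revenue-to-go, best offer set) at period t with `seats` left."""
    if t == HORIZON or seats == 0:
        return 0.0, ()
    stay = solve(t + 1, seats)[0]        # value if no sale happens
    sold = solve(t + 1, seats - 1)[0]    # value after selling one seat
    best_value, best_set = stay, ()
    for offer_set in OFFER_SETS:
        probs = purchase_probs(offer_set)
        # Expected gain over "no sale": fare plus the change in future value.
        gain = sum(p * (FARES[j] + sold - stay) for j, p in probs.items())
        candidate = stay + ARRIVAL_PROB * gain
        if candidate > best_value + 1e-12:
            best_value, best_set = candidate, offer_set
    return best_value, best_set


if __name__ == "__main__":
    print(solve(0, 5))
```

Brute-force enumeration is exponential in the number of fare classes, so it only works for a handful of classes; the reduction to nested allocation controls reported in the abstract is what removes that cost in practice.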
Abstract:
Stochastic dynamic programming is widely used in behavioral ecology, but it has shortcomings that stem from its reliance on a terminal time horizon. The authors present an alternative approach based on renewal theory. The proposed method uses cumulative energy reserves per unit of time as the criterion, which leads to stationary cycles in the state space. This approach makes it possible to study optimal feeding behavior by analytical methods.
Abstract:
This paper derives the HJB (Hamilton-Jacobi-Bellman) equation for sophisticated agents in a finite horizon dynamic optimization problem with non-constant discounting in a continuous setting, by using a dynamic programming approach. A simple example is used to illustrate the applicability of this HJB equation, by suggesting a method for constructing the subgame perfect equilibrium solution to the problem. Conditions for the observational equivalence with an associated problem with constant discounting are analyzed. Special attention is paid to the case of free terminal time. Strotz's model (a cake-eating problem of a nonrenewable resource with non-constant discounting) is revisited.
Abstract:
In the analysis of equilibrium policies in a differential game, if agents have different time preference rates, the cooperative (Pareto optimum) solution obtained by applying Pontryagin's Maximum Principle becomes time inconsistent. In this work we derive a set of dynamic programming equations (in discrete and continuous time) whose solutions are time-consistent equilibrium rules for N-player cooperative differential games in which agents differ in their instantaneous utility functions and also in their discount rates of time preference. The results are applied to the study of a cake-eating problem describing the management of a common property exhaustible natural resource. The extension of the results to a simple common property renewable natural resource model in infinite horizon is also discussed.
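For orientation, here is the textbook baseline that the cake-eating abstracts above depart from: a single agent, a constant discount factor, discrete time, and logarithmic utility. This is an assumed simplification and not the model of either paper, which deal with non-constant discounting and with players who have different discount rates. With log utility the value function has the form V_t(x) = A_t ln(x) + constant, where A_t = 1 + beta * A_{t+1}, and the optimal rule is to consume x / A_t.

```python
def cake_eating_path(cake, beta, periods):
    """Optimal consumption path maximizing sum_t beta**t * ln(c_t) with sum_t c_t = cake.

    Baseline only: one agent, constant discount factor beta, log utility.
    """
    # Backward recursion for the value-function coefficients A_t.
    coeffs = [0.0] * periods
    a_next = 0.0                       # A_T = 0: the cake is worthless after the horizon
    for t in reversed(range(periods)):
        coeffs[t] = 1.0 + beta * a_next
        a_next = coeffs[t]

    # Forward pass: consume the fraction 1 / A_t of what remains.
    path = []
    remaining = cake
    for t in range(periods):
        consumption = remaining / coeffs[t]
        path.append(consumption)
        remaining -= consumption
    return path


if __name__ == "__main__":
    print(cake_eating_path(1.0, 0.9, 5))
```

Under this baseline the plan is time consistent: recomputing the path from any later period reproduces the remainder of the original plan. That property is what fails once discounting is non-constant or differs across agents, which is the situation the abstracts address.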
Abstract:
This paper analyzes a continuous-time stochastic model in which the decision maker discounts the instantaneous utilities and the final function at constant but different rates of time preference. In this setting one can model problems in which, as time approaches the final moment, the valuation of the final function increases relative to the instantaneous utilities. This kind of asymmetry cannot be described with either standard or variable discounting. To obtain time-consistent solutions, the stochastic dynamic programming equation is derived, whose solutions are Markovian equilibria. For this type of time preferences, the classical consumption and investment model (Merton, 1971) is studied for CRRA and CARA utility functions, comparing the Markovian equilibria with the time-inconsistent solutions. Finally, the introduction of a random terminal time is discussed.
Abstract:
This article presents an economic model for determining whether a terminally ill insured person should sell a life insurance policy (in whole or in part) in the viatical settlements market. This market emerged in the late 1980s as a consequence of the AIDS epidemic. Today it is one part of the life settlements market. The policies traded in the viaticals market are those whose insured is terminally ill with a life expectancy of two years or less. The model is discrete and considers only two periods (years), since this is the maximum residual lifetime that the market contemplates. The agent has an initial wealth to be divided between consumption and bequest. The decision maker's expected utility function is introduced first and, using dynamic programming, the strategy that yields the highest utility is derived: not selling, selling (part of) the policy at time zero, or selling (part of) the policy at time one. The optimum depends on the price of the policy sold and on the individual's personal parameters. An analytical expression for the optimal strategy is found and a sensitivity analysis is carried out.
Abstract:
We present a framework for modeling right-hand gestures in bowed-string instrument playing, applied to violin. Nearly non-intrusive sensing techniques allow for accurate acquisition of relevant timbre-related bowing gesture parameter cues. We model the temporal contour of bow transversal velocity, bow pressing force, and bow-bridge distance as sequences of short segments, in particular Bézier cubic curve segments. Considering different articulations, dynamics, and contexts, a number of note classes are defined. Gesture parameter contours of a performance database are analyzed at note level by following a predefined grammar that dictates characteristics of curve segment sequences for each of the classes under consideration. Based on dynamic programming, gesture parameter contour analysis provides an optimal curve parameter vector for each note. The information present in such a parameter vector is enough for reconstructing original gesture parameter contours with significant fidelity. From the resulting representation vectors, we construct a statistical model based on Gaussian mixtures, suitable for both analysis and synthesis of bowing gesture parameter contours. We show the potential of the model by synthesizing bowing gesture parameter contours from an annotated input score. Finally, we point out promising applications and developments.
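The role of dynamic programming in this kind of contour analysis can be illustrated with a generic optimal-segmentation sketch. It is a deliberately simplified stand-in: each segment is fitted with a constant (its mean) rather than a cubic Bézier curve, there is no note-class grammar, and the function name `segment` and the sample contour are hypothetical. The recursion itself is the standard one: the best cost of covering the first j samples with k segments is the minimum over split points i of the best cost for i samples with k - 1 segments plus the fit cost of samples i to j.

```python
def segment(values, num_segments):
    """Split `values` into `num_segments` contiguous pieces with minimal squared error.

    Each piece is approximated by its mean (a stand-in for a fitted curve segment).
    Requires len(values) >= num_segments. Returns (total cost, list of (start, end)).
    """
    n = len(values)
    # Prefix sums give the squared error of any slice in O(1).
    sum1, sum2 = [0.0], [0.0]
    for v in values:
        sum1.append(sum1[-1] + v)
        sum2.append(sum2[-1] + v * v)

    def cost(i, j):
        """Squared error of values[i:j] around its own mean (j > i)."""
        total = sum1[j] - sum1[i]
        return (sum2[j] - sum2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of covering values[:j] with exactly k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for k in range(num_segments, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[num_segments][n], bounds


if __name__ == "__main__":
    contour = [0, 0, 0, 5, 5, 5, 1, 1]   # hypothetical sampled contour
    print(segment(contour, 3))
```

Replacing `cost` with the residual of a constrained curve fit, and restricting which segment types may follow each other, would move this sketch closer to the grammar-driven analysis the abstract describes.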