980 resultados para dynamic programming decomposition
Resumo:
Opaque products enable service providers to hide specific characteristics of their service fulfillment from the customer until after purchase. Prominent examples include internet-based service providers selling airline tickets without defining details, such as departure time or operating airline, until the booking has been made. Owing to the resulting flexibility in resource utilization, the traditional revenue management process needs to be modified. In this paper, we extend dynamic programming decomposition techniques widely used for traditional revenue management to develop an intuitive capacity control approach that allows for the incorporation of opaque products. In a simulation study, we show that the developed approach significantly outperforms other well-known capacity control approaches adapted to the opaque product setting. Based on the approach, we also provide computational examples of how the share of opaque products as well as the degree of opacity can influence the results.
Resumo:
People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
Resumo:
People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
Resumo:
This paper addresses the non-preemptive single machine scheduling problem to minimize total tardiness. We are interested in the online version of this problem, where orders arrive at the system at random times. Jobs have to be scheduled without knowledge of what jobs will come afterwards. The processing times and the due dates become known when the order is placed. The order release date occurs only at the beginning of periodic intervals. A customized approximate dynamic programming method is introduced for this problem. The authors also present numerical experiments that assess the reliability of the new approach and show that it performs better than a myopic policy.
Resumo:
1. Establishing biological control agents in the field is a major step in any classical biocontrol programme, yet there are few general guidelines to help the practitioner decide what factors might enhance the establishment of such agents. 2. A stochastic dynamic programming (SDP) approach, linked to a metapopulation model, was used to find optimal release strategies (number and size of releases), given constraints on time and the number of biocontrol agents available. By modelling within a decision-making framework we derived rules of thumb that will enable biocontrol workers to choose between management options, depending on the current state of the system. 3. When there are few well-established sites, making a few large releases is the optimal strategy. For other states of the system, the optimal strategy ranges from a few large releases, through a mixed strategy (a variety of release sizes), to many small releases, as the probability of establishment of smaller inocula increases. 4. Given that the probability of establishment is rarely a known entity, we also strongly recommend a mixed strategy in the early stages of a release programme, to accelerate learning and improve the chances of finding the optimal approach.
Resumo:
1. A model of the population dynamics of Banksia ornata was developed, using stochastic dynamic programming (a state-dependent decision-making tool), to determine optimal fire management strategies that incorporate trade-offs between biodiversity conservation and fuel reduction. 2. The modelled population of B. ornata was described by its age and density, and was exposed to the risk of unplanned fires and stochastic variation in germination success. 3. For a given population in each year, three management strategies were considered: (i) lighting a prescribed fire; (ii) controlling the incidence of unplanned fire; (iii) doing nothing. 4. The optimal management strategy depended on the state of the B. ornata population, with the time since the last fire (age of the population) being the most important variable. Lighting a prescribed fire at an age of less than 30 years was only optimal when the density of seedlings after a fire was low (< 100 plants ha(-1)) or when there were benefits of maintaining a low fuel load by using more frequent fire. 5. Because the cost of management was assumed to be negligible (relative to the value of the persistence of the population), the do-nothing option was never the optimal strategy, although lighting prescribed fires had only marginal benefits when the mean interval between unplanned fires was less than 20-30 years.
Resumo:
A decision theory framework can be a powerful technique to derive optimal management decisions for endangered species. We built a spatially realistic stochastic metapopulation model for the Mount Lofty Ranges Southern Emu-wren (Stipiturus malachurus intermedius), a critically endangered Australian bird. Using diserete-time Markov,chains to describe the dynamics of a metapopulation and stochastic dynamic programming (SDP) to find optimal solutions, we evaluated the following different management decisions: enlarging existing patches, linking patches via corridors, and creating a new patch. This is the first application of SDP to optimal landscape reconstruction and one of the few times that landscape reconstruction dynamics have been integrated with population dynamics. SDP is a powerful tool that has advantages over standard Monte Carlo simulation methods because it can give the exact optimal strategy for every landscape configuration (combination of patch areas and presence of corridors) and pattern of metapopulation occupancy, as well as a trajectory of strategies. It is useful when a sequence of management actions can be performed over a given time horizon, as is the case for many endangered species recovery programs, where only fixed amounts of resources are available in each time step. However, it is generally limited by computational constraints to rather small networks of patches. The model shows that optimal metapopulation, management decisions depend greatly on the current state of the metapopulation,. and there is no strategy that is universally the best. The extinction probability over 30 yr for the optimal state-dependent management actions is 50-80% better than no management, whereas the best fixed state-independent sets of strategies are only 30% better than no management. This highlights the advantages of using a decision theory tool to investigate conservation strategies for metapopulations. It is clear from these results that the sequence of management actions is critical, and this can only be effectively derived from stochastic dynamic programming. The model illustrates the underlying difficulty in determining simple rules of thumb for the sequence of management actions for a metapopulation. This use of a decision theory framework extends the capacity of population viability analysis (PVA) to manage threatened species.
Resumo:
We consider an optimal control problem with a deterministic finite horizon and state variable dynamics given by a Markov-switching jump–diffusion stochastic differential equation. Our main results extend the dynamic programming technique to this larger family of stochastic optimal control problems. More specifically, we provide a detailed proof of Bellman’s optimality principle (or dynamic programming principle) and obtain the corresponding Hamilton–Jacobi–Belman equation, which turns out to be a partial integro-differential equation due to the extra terms arising from the Lévy process and the Markov process. As an application of our results, we study a finite horizon consumption– investment problem for a jump–diffusion financial market consisting of one risk-free asset and one risky asset whose coefficients are assumed to depend on the state of a continuous time finite state Markov process. We provide a detailed study of the optimal strategies for this problem, for the economically relevant families of power utilities and logarithmic utilities.
Resumo:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model specifies simply which gene features are allowed immediately upstream which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
Resumo:
The aim of this thesis is to price options on equity index futures with an application to standard options on S&P 500 futures traded on the Chicago Mercantile Exchange. Our methodology is based on stochastic dynamic programming, which can accommodate European as well as American options. The model accommodates dividends from the underlying asset. It also captures the optimal exercise strategy and the fair value of the option. This approach is an alternative to available numerical pricing methods such as binomial trees, finite differences, and ad-hoc numerical approximation techniques. Our numerical and empirical investigations demonstrate convergence, robustness, and efficiency. We use this methodology to value exchange-listed options. The European option premiums thus obtained are compared to Black's closed-form formula. They are accurate to four digits. The American option premiums also have a similar level of accuracy compared to premiums obtained using finite differences and binomial trees with a large number of time steps. The proposed model accounts for deterministic, seasonally varying dividend yield. In pricing futures options, we discover that what matters is the sum of the dividend yields over the life of the futures contract and not their distribution.
Resumo:
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
Resumo:
The purpose of this expository arti le is to present a self- ontained overview of some results on the hara terization of the optimal value fun tion of a sto hasti target problem as (dis ontinuous) vis osity solution of a ertain dynami programming PDE and its appli ation to the problem of hedging ontingent laims in the presen e of portfolio onstraints and large investors
Resumo:
Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which as a data structure consisting of two parts: the yes-filter which is a standard Bloom filter and the no-filter which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with the standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses objects to include in the no-filter so that the no-filter recognises as many as possible false positives but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, with the constraint being that it should recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large leading to intractable optimal solution. Considering the similarity of the ILP with the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed making use of a reduced ILP for the value function approximation. Numerical results show the ADP model works best comparing with a number of heuristics as well as the CPLEX built-in solver (B&B), and this is what can be recommended for use in yes-no Bloom filters. In a wider context of the study of lossy compression algorithms, our researchis an example showing how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
Resumo:
We investigate several two-dimensional guillotine cutting stock problems and their variants in which orthogonal rotations are allowed. We first present two dynamic programming based algorithms for the Rectangular Knapsack (RK) problem and its variants in which the patterns must be staged. The first algorithm solves the recurrence formula proposed by Beasley; the second algorithm - for staged patterns - also uses a recurrence formula. We show that if the items are not so small compared to the dimensions of the bin, then these algorithms require polynomial time. Using these algorithms we solved all instances of the RK problem found at the OR-LIBRARY, including one for which no optimal solution was known. We also consider the Two-dimensional Cutting Stock problem. We present a column generation based algorithm for this problem that uses the first algorithm above mentioned to generate the columns. We propose two strategies to tackle the residual instances. We also investigate a variant of this problem where the bins have different sizes. At last, we study the Two-dimensional Strip Packing problem. We also present a column generation based algorithm for this problem that uses the second algorithm above mentioned where staged patterns are imposed. In this case we solve instances for two-, three- and four-staged patterns. We report on some computational experiments with the various algorithms we propose in this paper. The results indicate that these algorithms seem to be suitable for solving real-world instances. We give a detailed description (a pseudo-code) of all the algorithms presented here, so that the reader may easily implement these algorithms. (c) 2007 Elsevier B.V. All rights reserved.