962 results for approximate dynamic programming


Relevance:

90.00% 90.00%

Publisher:

Abstract:

I study long-term financial contracts between lenders and borrowers in the absence of perfect enforceability and when both parties are credit constrained. Borrowers repeatedly have projects to undertake and need external financing. Lenders can commit to contractual agreements, whereas borrowers can renege in any period. I show that equilibrium contracts feature interesting dynamics: the economy exhibits efficient investment cycles; absence of perfect enforcement and shortage of capital skew the cycles toward states of liquidity drought; credit is rationed if either the lender has too little capital or the borrower has too little collateral. This paper's technical contribution is its demonstration of the existence and characterization of financial contracts that are solutions to a non-convex dynamic programming problem.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This thesis is divided into two main chapters: the first deals with optimal control problems in one dimension and the second with problems in two or more dimensions. Note that, throughout this thesis, we assume that time does not enter as a factor. In the first chapter, we begin by deriving the dynamic programming equation for the minimal value F of the mathematical expectation of the cost function considered. We then use Whittle's theorem, which applies only if a condition relating the white noise v and the terms b and q associated with the control is satisfied. Otherwise, we proceed differently: a change of variable transforms our equation into a Riccati equation in G = F', but without initial conditions. In some cases, from the symmetry of the infinitesimal parameters and of q, we can deduce the point x' where G(x') = 0. When this is not the case, we restrict ourselves to good approximations. The same approach remains possible in particular situations, for example when there is a single barrier. In the second chapter, we treat problems in two or more dimensions. Since Whittle's condition is difficult to satisfy in this case, we try to generalize the results of the first chapter. In a few examples we use the method of similarity solutions, which reduces the problem to one dimension. We then propose a new solution method, which linearizes the dynamic programming equation, a nonlinear partial differential equation. It finally remains to find the initial conditions for the new function and to verify that the n expressions obtained for F are equivalent.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The aim of this thesis is to narrow the gap between two different control techniques: continuous control and discrete event system (DES) control. This gap can be reduced by studying hybrid systems and by interpreting the majority of large-scale systems as hybrid systems. In particular, when looking deeply into a process, it is often possible to identify interaction between discrete and continuous signals. Hybrid systems are systems that have both continuous and discrete signals. Continuous signals are generally assumed to be continuous and differentiable in time, whereas discrete signals are neither continuous nor differentiable in time because of their abrupt changes. Continuous signals often represent measurements of natural physical magnitudes such as temperature or pressure. Discrete signals are normally artificial signals, operated by human artefacts, such as current, voltage or light. Typical processes modelled as hybrid systems are production systems, chemical processes, and continuous production in which time and continuous measurements interact with the transport and stock inventory system. Complex systems such as manufacturing lines are hybrid in a global sense: they can be decomposed into several subsystems and their links. Another motivation for the study of hybrid systems is the set of tools developed in other research domains. These tools use temporal logic to analyse several properties of hybrid system models and to design systems and controllers that satisfy physical or imposed restrictions. This thesis focuses on particular types of systems in which discrete and continuous signals interact. Such systems can model hard non-linearities, such as hysteresis, jumps in the state and limit cycles, and their possible non-deterministic future behaviour, expressed by an interpretable model description.
The hybrid systems treated in this work are systems with several discrete states, always fewer than thirty (beyond which the problem can become NP-hard), and continuous dynamics evolving with expression: with Ki ∈ R^n constant vectors or matrices for the X components vector. In several states, some of the Ki can be zero. In this formulation, the mathematics can express a linear time-invariant system. Using this expression for a local part, several local linear models can be combined to represent non-linear systems, and through the interaction with the discrete events of the system the model can compose non-linear hybrid systems. Multistage processes with high continuous dynamics in particular are well represented by the proposed methodology. State vectors with more than two components, as in third-order or higher-order models, are well approximated by the proposed approximation. Flexible belt transmissions, chemical reactions with initial start-up and mobile robots with significant friction are physical systems that benefit from the accuracy of the proposed methodology. The motivation of this thesis is to obtain a solution that can control and drive a hybrid system from the origin or starting point to the goal. How to obtain this solution, and which solution is best in terms of a cost function subject to the physical restrictions and control actions, is analysed. Hybrid systems that have several possible states, different ways to drive the system to the goal and different continuous control signals are the problems that motivate this research. The requirements on the system are: a model that can represent the behaviour of non-linear systems and that makes it possible to predict the model's possible future behaviour, so that a supervisor can be applied which decides the optimal and safe action to drive the system toward the goal.
The specific problems that can be addressed with this kind of hybrid model are: the unity of order; control of the system along a reachable path; control of the system along a safe path; optimisation of the cost function; and modularity of control. The proposed model solves the specified problems in the switching models problem, the initial condition calculus and the unity of the order models. Continuous and discrete phenomena are represented in linear hybrid models, defined by eight-tuple parameters to model different types of hybrid phenomena. Applying a transformation over the state vector: for an LTI system we obtain from a two-dimensional SS a single parameter, alpha, which still retains the dynamical information. Combining this parameter with the system output, a complete description of the system is obtained in the form of a graph in polar representation. The Takagi-Sugeno type III model is a fuzzy model that includes a linear time-invariant (LTI) model for each local model; the fuzzification of the different LTI local models yields a non-linear time-invariant model. In our case the output and the alpha measure govern the membership function. Hybrid system control is a huge task: the process needs to be guided from the starting point to the desired end point, passing through different specific states and points along the trajectory. The system can be structured in different levels of abstraction, and the control of hybrid systems in three layers, from planning the process to producing the actions: the planning, process and control layers. In this case the algorithms will be applied to robotics, a domain where improvements are well accepted; it is expected to find simple repetitive processes for which the extra effort in complexity can be compensated by some cost reductions. It may also be interesting to apply some control optimisation to processes such as fuel injection, DC-DC converters, etc.
In order to apply the Ramadge-Wonham (RW) theory of discrete event systems to a hybrid system, we must abstract the continuous signals and project the events generated for these signals, to obtain new sets of observable and controllable events. Ramadge & Wonham's theory, along with the TCT software, gives a controllable sublanguage of the legal language generated for a discrete event system (DES). Continuous abstraction transforms predicates over continuous variables into controllable or uncontrollable events, and modifies the sets of uncontrollable, controllable, observable and unobservable events. Continuous signals produce virtual events in the system when they cross the bound limits. If such an event is deterministic, it can be projected. It is necessary to determine the controllability of this event in order to assign it to the corresponding set of controllable, uncontrollable, observable or unobservable events. Finding optimal trajectories that minimise some cost function is the goal of the modelling procedure. A mathematical model of the system allows the user to apply mathematical techniques to this expression: to minimise a specific cost function, to obtain optimal controllers and to approximate a specific trajectory. The combination of dynamic programming with Bellman's principle of optimality gives us the procedure to solve the minimum-time trajectory for hybrid systems. The problem is greater when there is interaction between adjacent states. In hybrid systems the problem is to determine the partial set points to be applied to the local models. An optimal controller can be implemented in each local model in order to ensure the minimisation of the local costs. The solution of this problem must give us the trajectory for the system to follow, a trajectory marked by a sequence of set points that the system is forced to pass through. Several ways are possible to drive the system from the starting point Xi to the end point Xf.
Different ways are of interest in different senses: dynamic sense, minimum states, approximation at set points, etc. These ways need to be safe, viable and RchW. Only one of them is to be applied, normally the best, which minimises the proposed cost function. Each Reachable Way, meaning a controllable and safe way, is evaluated in order to determine which one minimises the cost function. The contribution of this work is a complete framework for working with the majority of hybrid systems: the procedures to model, control and supervise are defined and explained, and their use is demonstrated. The procedure to model the systems to be analysed for automatic verification is also explained. Great improvements were obtained by using this methodology in comparison with other piecewise linear approximations. It is demonstrated that in particular cases this methodology can provide the best approximation. The most important contribution of this work is the Alpha approximation for non-linear systems with high dynamics. Although this kind of process is not typical, in this case the Alpha approximation is the best linear approximation to use, and it gives a compact representation.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IP) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional "flat" dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated. (C) 2011 Elsevier B.V. All rights reserved.
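The idea of a robust policy under transition uncertainty can be illustrated with a small "flat" (non-factored) sketch. It is not the factored algorithm of the abstract. It assumes, purely for illustration, that each imprecise transition model is given as a finite list of candidate distributions, so the inner minimisation is a plain enumeration rather than a call to a constrained optimizer. The function name `robust_value_iteration` and its argument layout are hypothetical.

```python
def robust_value_iteration(rewards, credal, gamma=0.9, tol=1e-10):
    """Max-min value iteration on a flat MDP with imprecise transitions.

    rewards[s][a]  : immediate reward for action a in state s
    credal[s][a]   : list of candidate next-state distributions (assumption:
                     a finite list stands in for the constraint-defined set)
    Returns the robust value of each state.
    """
    n = len(rewards)
    v = [0.0] * n
    while True:
        new_v = []
        for s in range(n):
            best = max(
                # The agent maximises; the worst candidate distribution is
                # picked for each action (the inner min).
                rewards[s][a] + gamma * min(
                    sum(p * vj for p, vj in zip(dist, v))
                    for dist in credal[s][a]
                )
                for a in range(len(rewards[s]))
            )
            new_v.append(best)
        # Stop once successive value vectors agree to within tol.
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
```

With a single candidate distribution per state-action pair, this reduces to ordinary value iteration.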

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Non-invasive spatial activity recognition is a difficult task, complicated by variation in how the same activities are conducted and by noise introduced by video tracking procedures. In this paper we propose an algorithm based on dynamic time warping (DTW) as a viable method with which to quantify segmented spatial activity sequences from a video tracking system. DTW is a widely used technique for optimally aligning or warping temporal sequences through minimisation of the distance between their components. The proposed algorithm, threshold DTW (TDTW), is capable of accurate spatial sequence distance quantification and is shown, using a three-class spatial data set, to be more robust and accurate than DTW and the discrete hidden Markov model (HMM). We also evaluate the application of a band dynamic programming (DP) constraint to TDTW in order to reduce extraneous warping between sequences and to reduce the computational complexity of the approach. Results show that application of a band DP constraint to TDTW improves runtime performance significantly, whilst still maintaining high precision and recall.
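As a rough illustration of the band DP constraint, the sketch below implements plain DTW on one-dimensional sequences with an optional band around the diagonal. It is not the TDTW algorithm of the abstract: the thresholding step is not described there and is left out. The function name `dtw_distance`, the absolute-difference local cost, and the `band` parameter are assumptions made for this example.

```python
import math


def dtw_distance(a, b, band=None):
    """DTW distance between two numeric sequences.

    band: optional half-width of the allowed region around the diagonal.
    Cells with |i - j| > band are never filled, which limits warping and
    reduces the number of cells computed.
    """
    n, m = len(a), len(b)
    if band is not None:
        # Widen the band if needed so the final cell stays reachable.
        band = max(band, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        if band is None:
            lo, hi = 1, m
        else:
            lo, hi = max(1, i - band), min(m, i + band)
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed metric)
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

With `band=0` and equal-length inputs, only the diagonal is filled and the result is the sum of pointwise differences; larger bands allow progressively more warping.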

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This paper presents necessary and sufficient conditions for the following problem: given a linear time-invariant plant G(s) = N(s)D(s)^{-1} = C(sI - A)^{-1}B, with m inputs, p outputs, p > m, rank(C) = p, rank(B) = rank(CB) = m, find a tandem dynamic controller G_c(s) = D_c(s)^{-1}N_c(s) = C_c(sI - A_c)^{-1}B_c + D_c, with p inputs and m outputs, and a constant output feedback matrix K_o ∈ ℝ^{m×p} such that the feedback system is Strictly Positive Real (SPR). It is shown that this problem has a solution if and only if all transmission zeros of the plant have negative real parts. When a solution exists, the proposed method first obtains G_c(s) so that all transmission zeros of G_c(s)G(s) have negative real parts, and then K_o is found as the solution of some Linear Matrix Inequalities (LMIs). Building on this result, a new LMI-based design for output Variable Structure Control (VSC) of uncertain dynamic plants is presented. The method can consider the following design specifications: matched disturbances or nonlinearities of the plant, output constraints, decay rate, and matched and nonmatched plant uncertainties. © 2006 IEEE.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

"January, 1971."

Relevance:

90.00% 90.00%

Publisher:

Abstract:

One of the most pressing issues facing the global conservation community is how to distribute limited resources between regions identified as priorities for biodiversity conservation(1-3). Approaches such as biodiversity hotspots(4), endemic bird areas(5) and ecoregions(6) are used by international organizations to prioritize conservation efforts globally(7). Although identifying priority regions is an important first step in solving this problem, it does not indicate how limited resources should be allocated between regions. Here we formulate how to allocate conservation resources optimally between regions identified as priorities for conservation: the 'conservation resource allocation problem'. Stochastic dynamic programming is used to find the optimal schedule of resource allocation for small problems but is intractable for large problems owing to the curse of dimensionality(8). We identify two easy-to-use and easy-to-interpret heuristics that closely approximate the optimal solution. We also show the importance of both correctly formulating the problem and using information on how investment returns change through time. Our conservation resource allocation approach can be applied at any spatial scale. We demonstrate the approach with an example of optimal resource allocation among five priority regions in Wallacea and Sundaland, the transition zone between Asia and Australasia.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

A novel algorithm for performing registration of dynamic contrast-enhanced (DCE) MRI data of the breast is presented. It is based on an algorithm known as iterated dynamic programming originally devised to solve the stereo matching problem. Using artificially distorted DCE-MRI breast images it is shown that the proposed algorithm is able to correct for movement and distortions over a larger range than is likely to occur during routine clinical examination. In addition, using a clinical DCE-MRI data set with an expertly labeled suspicious region, it is shown that the proposed algorithm significantly reduces the variability of the enhancement curves at the pixel level yielding more pronounced uptake and washout phases.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

An optimal stochastic controller pushes the closed-loop behavior as close as possible to the desired one. The fully probabilistic design (FPD) uses a probabilistic description of the desired closed loop and minimizes the Kullback-Leibler divergence of the closed-loop description to the desired one. Practical exploitation of fully probabilistic design control theory continues to be hindered by the computational complexities involved in numerically solving the associated stochastic dynamic programming problem, in particular very hard multivariate integration and approximate interpolation of the multivariate functions involved. This paper proposes a new fully probabilistic control algorithm that uses adaptive critic methods to circumvent the need for explicitly evaluating the optimal value function, thereby dramatically reducing computational requirements. This is the main contribution of this short paper.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

A novel approach to normal ECG recognition based on scale-space signal representation is proposed. The approach uses the curvature scale-space (CSS) signal representation, previously used to match the shapes of visual objects, and a dynamic programming algorithm for matching CSS representations of ECG signals. The extraction and matching processes are fast, and experimental results show that the approach is quite robust for preliminary normal ECG recognition.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Adaptive critic methods have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, nonlinear and nonstationary environments. In this study, a novel probabilistic dual heuristic programming (DHP) based adaptive critic controller is proposed. In contrast to current approaches, the proposed probabilistic DHP adaptive critic method takes the uncertainties of the forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterized by functional uncertainty. The theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the critic network is then calculated and shown to be equal to the analytically derived correct value.
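The kind of analytic check mentioned above can be sketched for the simplest case. The example assumes a deterministic scalar system x' = a·x + b·u with stage cost q·x² + r·u² (the abstract does not state the system, so these are illustrative choices). It iterates the scalar Riccati recursion to obtain the value function V(x) = p·x² and then evaluates the right-hand side of the Bellman equation under the optimal control. It does not implement the probabilistic DHP critic itself, and the function names are hypothetical.

```python
def solve_scalar_riccati(a, b, q, r, iters=500):
    """Fixed-point iteration of the scalar discrete-time Riccati recursion."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def bellman_rhs(x, p, a, b, q, r):
    """Stage cost plus next-state value under the optimal control u = -k x."""
    k = a * b * p / (r + b * b * p)  # optimal feedback gain for V(x) = p x^2
    u = -k * x
    x_next = a * x + b * u
    return q * x * x + r * u * u + p * x_next * x_next


# Example with a = b = q = r = 1: the fixed point satisfies p^2 - p - 1 = 0.
p = solve_scalar_riccati(1.0, 1.0, 1.0, 1.0)
residual = bellman_rhs(2.0, p, 1.0, 1.0, 1.0, 1.0) - p * 2.0 ** 2
```

A DHP critic approximates the derivative of the value function rather than the value itself; in this scalar case that derivative is 2·p·x, so the same `p` would provide the critic's target.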

Relevance:

90.00% 90.00%

Publisher:

Abstract:

AMS subject classification: 68Q22, 90C90

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Dwell times at stations and inter-station run times are the two major operational parameters for maintaining the train schedule in railway service. Current practice in dwell-time and run-time control is optimal only with respect to certain nominal traffic conditions, not necessarily the current service demand. The advantages of dwell-time and run-time control on trains are therefore not fully considered. The application of a dynamic programming approach, aided by an event-based model, to devise an optimal set of dwell times and run times for trains under given operational constraints at a regional level is presented. Since train operation is interactive and multi-attribute, dwell-time and run-time coordination among trains is a multi-dimensional problem. The computational demand of devising trains' instructions, a prime concern in real-time applications, is excessively high. To reduce the computational demand in providing appropriate dwell times and run times for trains, a DC railway line is divided into a number of regions, and each region is controlled by a dwell-time and run-time controller. The performance and feasibility of the controller in formulating the dwell-time and run-time solutions for real-time applications are demonstrated through simulations.