884 results for Optimal control problem


Relevance:

90.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

90.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Pós-graduação em Engenharia Elétrica - FEIS

Relevance:

90.00%

Publisher:

Abstract:

Pós-graduação em Engenharia Elétrica - FEIS

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

This work addresses the problem of robust model predictive control (MPC) of systems with model uncertainty. The case of zone control of multi-variable stable systems with multiple time delays is considered. The usual approach to this kind of problem is to include a non-linear cost constraint in the control problem. The control action is then obtained at each sampling time as the solution to a non-linear programming (NLP) problem, which can be computationally expensive for high-order systems. Here, the robust MPC problem is formulated as a linear matrix inequality problem that can be solved in real time with a fraction of the computational effort. The proposed approach is compared with the conventional robust MPC and tested through the simulation of a reactor system from the process industry.

Relevance:

90.00%

Publisher:

Abstract:

This paper studies the average control problem of discrete-time Markov Decision Processes (MDPs for short) with general state space, Feller transition probabilities, and possibly non-compact control constraint sets $A(x)$. Two hypotheses are considered: either the cost function $c$ is strictly unbounded or the multifunctions $A_r(x) = \{a \in A(x) : c(x, a) \le r\}$ are upper-semicontinuous and compact-valued for each real $r$. For these two cases we provide new results for the existence of a solution to the average-cost optimality equality and inequality using the vanishing discount approach. We also study the convergence of the policy iteration approach under these conditions. It should be pointed out that we do not make any assumptions regarding the convergence and the continuity of the limit function generated by the sequence of relative differences of the $\alpha$-discounted value functions and the Poisson equations, as often encountered in the literature. (C) 2012 Elsevier Inc. All rights reserved.
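As a rough illustration of the vanishing discount approach named in this abstract, the sketch below uses a toy finite MDP rather than the general state space the paper treats. The transition probabilities, costs, and all function names are hypothetical and chosen only for illustration. It computes the α-discounted value function V_α by value iteration, then forms ρ ≈ (1 − α) V_α(x0) and h(x) ≈ V_α(x) − V_α(x0), which for α close to 1 should approximately satisfy the average-cost optimality equation ρ + h(x) = min_a [c(x, a) + Σ_y P(y | x, a) h(y)].

```python
# Toy finite MDP (hypothetical numbers): 2 states, 2 actions.
# P[x][a] is the next-state distribution, c[x][a] the one-step cost.
P = {0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
     1: {0: [0.5, 0.5], 1: [0.1, 0.9]}}
c = {0: {0: 1.0, 1: 3.0},
     1: {0: 5.0, 1: 2.0}}


def expected(dist, values):
    """Expectation of `values` under the next-state distribution `dist`."""
    return sum(p * v for p, v in zip(dist, values))


def discounted_values(alpha, tol=1e-10, max_iter=1_000_000):
    """Value iteration for the alpha-discounted cost problem."""
    V = [0.0 for _ in P]
    for _ in range(max_iter):
        V_new = [min(c[x][a] + alpha * expected(P[x][a], V) for a in P[x])
                 for x in P]
        if max(abs(u - v) for u, v in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


def vanishing_discount(alpha, x0=0):
    """Approximate (rho, h) from the alpha-discounted value function."""
    V = discounted_values(alpha)
    rho = (1.0 - alpha) * V[x0]      # approximate optimal average cost
    h = [v - V[x0] for v in V]       # approximate relative value function
    return rho, h


def acoe_residual(rho, h):
    """Largest violation of rho + h(x) = min_a [c(x,a) + E h(next state)]."""
    return max(abs(rho + h[x]
                   - min(c[x][a] + expected(P[x][a], h) for a in P[x]))
               for x in P)


rho, h = vanishing_discount(alpha=0.999)
print(rho, h, acoe_residual(rho, h))
```

For this toy model the exact average-cost solution can be checked by hand (ρ = 1.5 with h = (0, 5)), so the printed values should be close to those numbers, with an error that shrinks as α approaches 1.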

Relevance:

90.00%

Publisher:

Abstract:

In this work we are concerned with the analysis and numerical solution of Black-Scholes type equations arising in the modeling of incomplete financial markets and an inverse problem of determining the local volatility function in a generalized Black-Scholes model from observed option prices. In the first chapter a fully nonlinear Black-Scholes equation which models transaction costs arising in option pricing is discretized by a new high order compact scheme. The compact scheme is proved to be unconditionally stable and non-oscillatory and is very efficient compared to classical schemes. Moreover, it is shown that the finite difference solution converges locally uniformly to the unique viscosity solution of the continuous equation. In the next chapter we turn to the calibration problem of computing local volatility functions from market data in a generalized Black-Scholes setting. We follow an optimal control approach in a Lagrangian framework. We show the existence of a global solution and study first- and second-order optimality conditions. Furthermore, we propose an algorithm that is based on a globalized sequential quadratic programming method and a primal-dual active set strategy, and present numerical results. In the last chapter we consider a quasilinear parabolic equation with quadratic gradient terms, which arises in the modeling of an optimal portfolio in incomplete markets. The existence of weak solutions is shown by considering a sequence of approximate solutions. The main difficulty of the proof is to infer the strong convergence of the sequence. Furthermore, we prove the uniqueness of weak solutions under a smallness condition on the derivatives of the covariance matrices with respect to the solution, but without additional regularity assumptions on the solution. The results are illustrated by a numerical example.

Relevance:

90.00%

Publisher:

Abstract:

This thesis deals with distributed control strategies for cooperative control of multi-robot systems. Specifically, distributed coordination strategies are presented for groups of mobile robots. The formation control problem is initially solved using artificial potential fields. The purpose of the presented formation control algorithm is to drive a group of mobile robots to create a completely arbitrarily shaped formation. Robots are initially controlled to create a regular polygon formation. A bijective coordinate transformation is then used to extend the scope of this strategy to arbitrarily shaped formations. For this purpose, artificial potential fields are specifically designed, and robots are driven to follow their negative gradient. Artificial potential fields are subsequently used to solve the coordinated path tracking problem, making the robots autonomously spread along predefined paths and move along them in a coordinated way. The formation control problem is then solved using a consensus-based approach. Specifically, weighted graphs are used both to define the desired formation and to implement collision avoidance. As expected for consensus-based algorithms, this control strategy is experimentally shown to be robust to the presence of communication delays. The global connectivity maintenance issue is then considered. Specifically, an estimation procedure is introduced to allow each agent to compute its own estimate of the algebraic connectivity of the communication graph in a distributed manner. This estimate is then used to develop a gradient-based control strategy that ensures that the communication graph remains connected as the system evolves. The proposed control strategy is developed initially for single-integrator kinematic agents, and is then extended to Lagrangian dynamical systems.
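The consensus-based formation idea can be illustrated with a minimal sketch. It assumes single-integrator agents in the plane and the textbook consensus law u_i = −Σ_j w_ij [(x_i − x_j) − (d_i − d_j)], where d_i are desired offsets. This is a generic law, not necessarily the one used in the thesis, and it omits collision avoidance, communication delays, and connectivity maintenance. The graph, offsets, start positions, and function names are all hypothetical.

```python
import math


def formation_step(pos, offsets, weights, dt):
    """One Euler step of u_i = -sum_j w_ij * ((x_i - x_j) - (d_i - d_j))."""
    n = len(pos)
    new_pos = []
    for i in range(n):
        ux = uy = 0.0
        for j in range(n):
            w = weights[i][j]
            ux -= w * ((pos[i][0] - pos[j][0]) - (offsets[i][0] - offsets[j][0]))
            uy -= w * ((pos[i][1] - pos[j][1]) - (offsets[i][1] - offsets[j][1]))
        new_pos.append((pos[i][0] + dt * ux, pos[i][1] + dt * uy))
    return new_pos


def formation_error(pos, offsets):
    """Largest deviation of pos_i - offset_i from the group mean.

    The value is zero exactly when the agents form the desired shape
    (up to a common translation).
    """
    n = len(pos)
    ex = [p[0] - d[0] for p, d in zip(pos, offsets)]
    ey = [p[1] - d[1] for p, d in zip(pos, offsets)]
    mx, my = sum(ex) / n, sum(ey) / n
    return max(math.hypot(x - mx, y - my) for x, y in zip(ex, ey))


def run_formation(pos, offsets, weights, dt=0.05, steps=400):
    """Simulate the consensus law for a fixed number of Euler steps."""
    for _ in range(steps):
        pos = formation_step(pos, offsets, weights, dt)
    return pos


# Hypothetical example: 4 agents on a ring graph, target shape a unit square.
RING = [[0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0]]
SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
START = [(0.0, 0.0), (3.0, -1.0), (-2.0, 2.5), (1.0, 4.0)]

final = run_formation(START, SQUARE, RING)
print(formation_error(START, SQUARE), formation_error(final, SQUARE))
```

Because the weights are symmetric, the centroid of the group stays fixed while the shape error decays. The step size must satisfy dt · λ_max < 2 for the Euler discretization to be stable, where λ_max is the largest eigenvalue of the graph Laplacian (4 for this ring).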

Relevance:

90.00%

Publisher:

Abstract:

This thesis addresses the problem of scaling reinforcement learning to high-dimensional and complex tasks. Reinforcement learning here denotes a class of learning methods based on approximate dynamic programming that is used in particular in artificial intelligence and can be employed for the autonomous control of simulated agents or real hardware robots in dynamic and uncertain environments. To this end, regression on samples is used to determine a function that solves an "optimality equation" (Bellman) and from which approximately optimal decisions can be derived. A major obstacle is the dimensionality of the state space, which is often high and therefore poorly suited to traditional grid-based approximation methods. The goal of this thesis is to make reinforcement learning applicable to problems of, in principle, arbitrarily high dimension by means of non-parametric function approximation (more precisely, regularization networks). Regularization networks are a generalization of ordinary basis function networks that parametrize the sought solution through the data, which removes the need to choose nodes/basis functions explicitly and thus allows the "curse of dimensionality" to be circumvented for high-dimensional inputs. At the same time, regularization networks are linear approximators, which are technically easy to handle and for which the existing convergence results for reinforcement learning remain valid (unlike, for example, feed-forward neural networks). All these theoretical advantages are offset, however, by a very practical problem: the computational cost of using regularization networks inherently scales as O(n^3), where n is the number of data points.
This is particularly problematic because in reinforcement learning the learning process takes place online: the samples are generated by an agent/robot while it interacts with the environment. Adjustments to the solution must therefore be made immediately and with little computational effort. The contribution of this thesis is accordingly divided into two parts. In the first part, we formulate an efficient learning algorithm for regularization networks that solves general regression tasks and is specifically tailored to the requirements of online learning. Our approach is based on recursive least squares, but can insert not only new data but also new basis functions into the existing model at constant time cost. This is made possible by the "Subset of Regressors" approximation, in which the kernel is approximated by a strongly reduced selection of training data, and by a greedy selection procedure that picks these basis elements directly from the data stream at run time. In the second part, we transfer this algorithm to approximate policy evaluation using least-squares-based temporal-difference learning, and integrate this building block into an overall system for autonomously learning optimal behavior. Overall, we develop a highly data-efficient method that is particularly suited to learning problems from robotics with continuous and high-dimensional state spaces and stochastic state transitions. We do not rely on a model of the environment, work largely independently of the dimension of the state space, achieve convergence with relatively few agent-environment interactions, and, thanks to the efficient online algorithm, can also operate in time-critical real-time applications.
We demonstrate the performance of our approach on two realistic and complex application examples: the RoboCup-Keepaway problem and the control of a (simulated) octopus tentacle.
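The abstract above builds on recursive least squares (RLS). As a hedged illustration of that building block only, the sketch below implements plain RLS with a fixed feature vector. It does not include the online insertion of basis functions, the "Subset of Regressors" approximation, or the temporal-difference extension described in the thesis. The class name, the regularization default, and the test data are hypothetical.

```python
class RecursiveLeastSquares:
    """Plain RLS for ridge regression with a fixed feature dimension.

    Maintains the weight vector w and the matrix
    P = (reg * I + sum_t phi_t phi_t^T)^(-1), updated in O(dim^2) per sample
    via the Sherman-Morrison formula instead of re-solving from scratch.
    """

    def __init__(self, dim, reg=1e-3):
        self.w = [0.0] * dim
        self.P = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                  for i in range(dim)]

    def predict(self, phi):
        return sum(w_i * p_i for w_i, p_i in zip(self.w, phi))

    def update(self, phi, y):
        n = len(self.w)
        # P is symmetric, so P @ phi also equals (phi^T @ P)^T.
        P_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
        gain = [v / denom for v in P_phi]
        err = y - self.predict(phi)          # prediction error before update
        for i in range(n):
            self.w[i] += gain[i] * err
            for j in range(n):
                self.P[i][j] -= gain[i] * P_phi[j]


# Hypothetical usage: recover y = 2x - 1 from a stream of noise-free samples,
# using the features phi(x) = (x, 1).
model = RecursiveLeastSquares(dim=2, reg=1e-6)
for k in range(50):
    x = 0.1 * k
    model.update([x, 1.0], 2.0 * x - 1.0)
print(model.w)
```

Each update costs O(dim^2) regardless of how many samples have been seen, which is the property that makes RLS attractive for online learning compared with the O(n^3) batch solve mentioned in the abstract.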

Relevance:

90.00%

Publisher:

Abstract:

This paper presents a novel variable decomposition approach for pose recovery of the distal locking holes using a single calibrated fluoroscopic image. The problem is formulated as a model-based optimal fitting process, where the control variables are decomposed into two sets: (a) the angle between the nail axis and its projection on the imaging plane, and (b) the translation and rotation of the geometrical model of the distal locking hole around the nail axis. By using an iterative algorithm to find the optimal values of the latter set of variables for any given value of the former variable, we reduce the multi-dimensional model-based optimal fitting problem to a one-dimensional search along a finite interval. We report the results of our in vitro experiments, which demonstrate that the accuracy of our approach is adequate for successful distal locking of intramedullary nails.

Relevance:

90.00%

Publisher:

Abstract:

In power-electronic-based microgrids, the computational requirements needed to implement an optimized online control strategy can be prohibitive. The work presented in this dissertation proposes a generalized method for deriving geometric manifolds in a dc microgrid, based on the a-priori computation of the optimal reactions and trajectories for classes of events in the microgrid. The proposed states are the stored energies in all the energy storage elements of the dc microgrid and the power flowing into them. It is anticipated that calculating a large enough set of dissimilar transient scenarios will also span many scenarios not specifically used to develop the surface. These geometric manifolds will then be used as reference surfaces in any type of controller, such as a sliding mode hysteretic controller. The presence of switched power converters in microgrids involves different control actions for different system events. The control of the switch states of the converters is essential for steady-state and transient operations. A digital memory look-up based controller that uses a hysteretic sliding mode control strategy is an effective technique to generate the proper switch states for the converters. An example dc microgrid with three dc-dc boost converters and resistive loads is considered for this work. The geometric manifolds are successfully generated for transient events, such as step changes in the loads and the sources. The surfaces corresponding to a specific case of step change in the loads are then used as reference surfaces in an EEPROM for experimentally validating the control strategy. The required switch states corresponding to this specific transient scenario are programmed in the EEPROM as a memory table. This controls the switching of the dc-dc boost converters and drives the system states to the reference manifold.
In this work, it is shown that this strategy effectively controls the system for a transient condition such as step changes in the loads for the example case.

Relevance:

90.00%

Publisher:

Abstract:

The study investigates the role of credit risk in a continuous-time stochastic asset allocation model, since the traditional dynamic framework does not provide credit risk flexibility. The general model of the study extends the traditional dynamic efficiency framework by explicitly deriving the optimal value function for the infinite-horizon stochastic control problem via a weighted volatility measure of market and credit risk. The model's optimal strategy is then compared to that obtained from a benchmark Markowitz-type dynamic optimization framework to determine which specification adequately reflects the optimal terminal investment returns and strategy under credit and market risks. The paper shows that an investor's optimal terminal return is lower than typically indicated under the traditional mean-variance framework during periods of elevated credit risk. Hence I conclude that, while the traditional dynamic mean-variance approach may indicate the ideal, in the presence of credit risk it does not accurately reflect the observed optimal returns, terminal wealth and portfolio selection strategies.

Relevance:

90.00%

Publisher:

Abstract:

This paper addresses an uplink power control dynamic game in which each user's battery represents the system state, which changes with time following a discrete-time version of a differential game. To overcome the complexity of analyzing a dynamic game, we focus on the concept of Dynamic Potential Games, showing that the game can be solved as an equivalent Multivariate Optimum Control Problem. The solution to this problem is notable because different users split their activity in time, avoiding higher interference and providing long-term fairness.