953 results for HJB equation


Relevance:

60.00%

Publisher:

Abstract:

An optimal control law for a general nonlinear system can be obtained by solving the Hamilton-Jacobi-Bellman (HJB) equation. However, it is difficult to obtain an analytical solution of this equation even for a moderately complex system. In this paper, we propose a continuous-time single network adaptive critic scheme for nonlinear control-affine systems, where the optimal cost-to-go function is approximated using a parametric positive semi-definite function. Unlike earlier approaches, a continuous-time weight update law is derived from the HJB equation. The stability of the system is analysed during the evolution of the weights using Lyapunov theory. The effectiveness of the scheme is demonstrated through simulation examples.
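For orientation, the HJB equation for a control-affine system can be written out as follows. This is a textbook sketch rather than the paper's own formulation: the symbols $f$, $g$, $Q$, $R$ and $V$, the infinite-horizon cost, and the quadratic control penalty $u^\top R u$ are assumptions introduced here.

```latex
% Assumed setting: \dot{x} = f(x) + g(x)u, with R positive definite and
% cost  J = \int_0^\infty \big( Q(x) + u^\top R u \big)\, dt .
\begin{align}
  0 &= \min_{u} \Big[ Q(x) + u^\top R u + \nabla V(x)^\top \big( f(x) + g(x) u \big) \Big], \\
  u^{*}(x) &= -\tfrac{1}{2} R^{-1} g(x)^\top \nabla V(x), \\
  0 &= Q(x) + \nabla V(x)^\top f(x)
       - \tfrac{1}{4} \nabla V(x)^\top g(x) R^{-1} g(x)^\top \nabla V(x).
\end{align}
```

The last line follows by substituting $u^{*}$ into the first. It is a nonlinear first-order PDE in the cost-to-go $V$, which is why an analytical solution is rarely available and a parametric approximation of $V$ is attractive.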

Relevance:

60.00%

Publisher:

Abstract:

The Hamilton-Jacobi-Bellman (HJB) equation is central to stochastic optimal control (SOC) theory, yielding the optimal solution to general problems specified by known dynamics and a specified cost functional. Given the assumption of quadratic cost on the control input, it is well known that the HJB reduces to a particular partial differential equation (PDE). While powerful, this reduction is not commonly used, as the PDE is of second order, is nonlinear, and examples exist where the problem may not have a solution in a classical sense. Furthermore, each state of the system appears as another dimension of the PDE, giving rise to the curse of dimensionality. Since the number of degrees of freedom required to solve the optimal control problem grows exponentially with dimension, the problem becomes intractable for all but systems of modest dimension.

In the last decade researchers have found that under certain, fairly non-restrictive structural assumptions, the HJB may be transformed into a linear PDE, with an interesting analogue in the discretized domain of Markov Decision Processes (MDP). The work presented in this thesis uses the linearity of this particular form of the HJB PDE to push the computational boundaries of stochastic optimal control.
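The transformation to a linear PDE can be sketched as follows. The notation is assumed here rather than taken from the thesis: dynamics $dx = (f + G u)\,dt + B\,d\omega$ with noise covariance $\Sigma_\epsilon$, running cost $q(x) + \tfrac{1}{2} u^\top R u$, a stationary (for example first-exit) value function $V$, and a scalar $\lambda > 0$.

```latex
% Nonlinear HJB after minimising over u, with u^* = -R^{-1} G^\top \nabla V
% and \Sigma_t := B \Sigma_\epsilon B^\top :
\begin{equation}
  0 = q + f^\top \nabla V
      - \tfrac{1}{2} \nabla V^\top G R^{-1} G^\top \nabla V
      + \tfrac{1}{2} \operatorname{Tr}\!\big( \nabla^2 V \, \Sigma_t \big).
\end{equation}
% Structural assumption and logarithmic change of variables:
\begin{equation}
  \lambda\, G R^{-1} G^\top = \Sigma_t, \qquad V = -\lambda \log \Psi .
\end{equation}
% Then \nabla V = -\lambda \nabla\Psi / \Psi and
% \nabla^2 V = -\lambda \nabla^2\Psi / \Psi + \lambda \nabla\Psi \nabla\Psi^\top / \Psi^2 ,
% so the two terms quadratic in \nabla\Psi cancel, leaving
\begin{equation}
  0 = -\frac{1}{\lambda}\, q\, \Psi + f^\top \nabla \Psi
      + \tfrac{1}{2} \operatorname{Tr}\!\big( \nabla^2 \Psi \, \Sigma_t \big),
\end{equation}
% which is linear in \Psi.
```

The structural assumption ties the control cost to the noise: directions with more noise must be cheaper to actuate. Under it, solutions of the transformed equation can be superposed, which is the property the linear formulation exploits.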

This is done by crafting together previously disjoint lines of research in computation. The first of these is the use of Sum of Squares (SOS) techniques for synthesis of control policies. A candidate polynomial with variable coefficients is proposed as the solution to the stochastic optimal control problem. An SOS relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function. It is shown that these results extend to arbitrary parabolic and elliptic PDEs, yielding a novel method for Uncertainty Quantification (UQ) of systems governed by partial differential constraints. Domain decomposition techniques are also made available, allowing for such problems to be solved via parallelization and low-order polynomials.

The optimization-based SOS technique is then contrasted with the Separated Representation (SR) approach from the applied mathematics community. The technique allows for systems of equations to be solved through a low-rank decomposition that results in algorithms that scale linearly with dimensionality. Its application in stochastic optimal control allows for previously uncomputable problems to be solved quickly, scaling to such complex systems as the Quadcopter and VTOL aircraft. This technique may be combined with the SOS approach, yielding not only a numerical technique, but also an analytical one that allows for entirely new classes of systems to be studied and for stability properties to be guaranteed.

The analysis of the linear HJB is completed by the study of its implications in application. It is shown that the HJB and a popular technique in robotics, the use of navigation functions, sit on opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. Analytical solutions to the HJB in these settings are available in simplified domains, yielding guidance towards optimality for approximation schemes. Finally, the use of HJB equations in temporal multi-task planning problems is investigated. It is demonstrated that such problems are reducible to a sequence of SOC problems linked via boundary conditions. The linearity of the PDE allows us to pre-compute control policy primitives and then compose them, at essentially zero cost, to satisfy a complex temporal logic specification.

Relevance:

60.00%

Publisher:

Abstract:

In this paper we consider nonautonomous optimal control problems of infinite horizon type, whose control actions are given by $L^1$-functions. We verify that the value function is locally Lipschitz. The equivalence between dynamic programming inequalities and Hamilton-Jacobi-Bellman (HJB) inequalities for proximal sub (super) gradients is proven. Using this result we show that the value function is a Dini solution of the HJB equation. We obtain a verification result for the class of Dini sub-solutions of the HJB equation and also prove a minimax property of the value function with respect to the sets of Dini semi-solutions of the HJB equation. We introduce the concept of viscosity solutions of the HJB equation in infinite horizon and prove the equivalence between this and the concept of Dini solutions. In the Appendix we provide an existence theorem. (c) 2006 Elsevier B.V. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

This master's thesis deals with ruin theory, and more specifically with actuarial surplus models in which dividends are paid. We study in detail a model called the gamma-omega model, which allows flexibility in both the timing of dividend payments and a non-standard notion of ruin for the company. Several extensions of the literature are made, motivated by solvency considerations. The first adapts results from a 2011 article to a new model modified by the addition of a solvency constraint. The second, more substantial, proves the optimality of a barrier strategy for dividend payments in the gamma-omega model. The third adapts a 2003 theorem on the optimality of barriers under a solvency constraint, which had not been proven in the case of periodic dividends. We also give results analogous to those of the 2011 article for a barrier under the solvency constraint. Finally, the last extension concerns two different approaches to adopt when the surplus falls below the ruin threshold. In the first case, a forced liquidation of the surplus is put in place, alongside a liquidation at the first opportunity in the event of poor dividend forecasts. In the second case, a capital injection process is tested. We study the impact of these solutions on the amount of expected dividends. Numerical illustrations are provided for each section where relevant.

Relevance:

20.00%

Publisher:

Abstract:

Diffusion equations that use time fractional derivatives are attractive because they describe a wealth of problems involving non-Markovian random walks. The time fractional diffusion equation (TFDE) is obtained from the standard diffusion equation by replacing the first-order time derivative with a fractional derivative of order α ∈ (0, 1). Developing numerical methods for solving fractional partial differential equations is a new research field, and the theoretical analysis of the associated numerical methods is not fully developed. In this paper an explicit conservative difference approximation (ECDA) for the TFDE is proposed. We give a detailed analysis of this ECDA and generate discrete random walk models suitable for simulating random variables whose spatial probability density evolves in time according to this fractional diffusion equation. The stability and convergence of the ECDA for the TFDE in a bounded domain are discussed. Finally, some numerical examples are presented to show the application of the present technique.

Relevance:

20.00%

Publisher:

Abstract:

The purpose of this research was to develop and test a multicausal model of the individual characteristics associated with academic success in first-year Australian university students. This model comprised the constructs of: previous academic performance, achievement motivation, self-regulatory learning strategies, and personality traits, with end-of-semester grades the dependent variable of interest. The study involved the distribution of a questionnaire, which assessed motivation, self-regulatory learning strategies and personality traits, to 1193 students at the start of their first year at university. Students' academic records were accessed at the end of their first year of study to ascertain their first and second semester grades. This study established that previous high academic performance, use of self-regulatory learning strategies, and being introverted and agreeable, were indicators of academic success in the first semester of university study. Achievement motivation and the personality trait of conscientiousness were indirectly related to first semester grades, through the influence they had on the students' use of self-regulatory learning strategies. First semester grades were predictive of second semester grades. This research provides valuable information for both educators and students about the factors intrinsic to the individual that are associated with successful performance in the first year at university.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we consider a time fractional diffusion equation on a finite domain. The equation is obtained from the standard diffusion equation by replacing the first-order time derivative with a fractional derivative of order $0<\alpha<1$. We propose a computationally effective implicit difference approximation to solve the time fractional diffusion equation. Stability and convergence of the method are discussed. We prove that the implicit difference approximation (IDA) is unconditionally stable and that it converges with order $O(\tau+h^2)$, where $\tau$ and $h$ are the time and space steps, respectively. Some numerical examples are presented to show the application of the present technique.
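As a rough illustration of what an implicit difference approximation of this kind can look like, the sketch below combines the L1 discretisation of the Caputo time derivative with a centred second difference in space. This is an assumption: the abstract does not state which formulas the authors use. The function names `solve_tridiagonal` and `implicit_tfde`, the diffusion coefficient `K`, and the homogeneous Dirichlet boundary conditions are illustrative choices made here.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; inputs are left unmodified.

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_tfde(u0, alpha, K, length, t_end, n_steps):
    """Solve D_t^alpha u = K u_xx on (0, length) with u = 0 at both ends.

    u0 holds initial values at every grid node, boundaries included
    (the boundary entries are ignored and treated as zero).
    Requires 0 < alpha < 1. Returns the solution at t_end on the same grid.
    """
    m = len(u0) - 1                      # number of space intervals
    h = length / m
    tau = t_end / n_steps
    mu = K * math.gamma(2.0 - alpha) * tau ** alpha / h ** 2
    # L1 weights b_j = (j+1)^(1-alpha) - j^(1-alpha), with b_0 = 1
    b = [(j + 1) ** (1.0 - alpha) - j ** (1.0 - alpha)
         for j in range(n_steps + 1)]
    n_int = m - 1                        # interior unknowns
    lower = [-mu] * n_int
    diag = [1.0 + 2.0 * mu] * n_int
    upper = [-mu] * n_int
    history = [list(u0[1:m])]            # all past time levels are needed
    for n in range(n_steps):
        rhs = []
        for i in range(n_int):
            val = history[n][i]
            # memory term: weighted increments of earlier time levels
            for j in range(1, n + 1):
                val -= b[j] * (history[n + 1 - j][i] - history[n - j][i])
            rhs.append(val)
        history.append(solve_tridiagonal(lower, diag, upper, rhs))
    return [0.0] + history[-1] + [0.0]
```

Each step solves one tridiagonal system, but the right-hand side sums over every earlier time level, so the work per step grows with the step index. That memory term is the discrete counterpart of the non-local fractional derivative.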

Relevance:

20.00%

Publisher:

Abstract:

In this paper, a singularly perturbed ordinary differential equation with non-smooth data is considered. The numerical method is a Petrov-Galerkin finite element method with piecewise-exponential test functions and piecewise-linear trial functions. A special technique is used at the point where the coefficient is discontinuous. The method is shown to be first-order accurate and to converge uniformly with respect to the singular perturbation parameter. Finally, numerical results are presented, which are in agreement with the theoretical results.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, a space fractional diffusion equation (SFDE) with nonhomogeneous boundary conditions on a bounded domain is considered. A new matrix transfer technique (MTT) for solving the SFDE is proposed. The method is based on a matrix representation of the fractional-in-space operator, and the novelty of this approach is that a standard discretisation of the operator leads to a system of linear ODEs with the matrix raised to the same fractional power. Analytic solutions of the SFDE are derived. Finally, some numerical results are given to demonstrate that the MTT is a computationally efficient and accurate method for solving the SFDE.
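The idea of raising the discretised operator to a fractional power can be sketched in a simplified setting. The assumptions here are not from the abstract: the equation is taken as $\partial u/\partial t = -K(-\partial_{xx})^{\alpha/2} u$ with homogeneous Dirichlet boundaries (the paper treats nonhomogeneous ones), the operator is discretised by the standard three-point matrix $A$, and the hypothetical function `mtt_solve` uses the known closed-form eigenpairs of $A$ to evaluate $\exp(-K t A^{\alpha/2})\,u_0$.

```python
import math


def mtt_solve(u0_interior, alpha, K, length, t):
    """Evaluate u(t) = exp(-K t A^(alpha/2)) u0 at the interior grid nodes.

    A = tridiag(-1, 2, -1) / h^2 is the standard finite-difference
    approximation of -d^2/dx^2 with zero boundary values. Its eigenpairs are
    known in closed form, so the fractional matrix power acts on each
    eigenvalue separately.
    """
    n = len(u0_interior)
    m = n + 1                            # number of space intervals
    h = length / m
    out = [0.0] * n
    for k in range(1, m):
        # k-th eigenvalue and (unnormalised) eigenvector of A
        lam = (4.0 / h ** 2) * math.sin(k * math.pi / (2 * m)) ** 2
        vk = [math.sin(i * k * math.pi / m) for i in range(1, m)]
        # coefficient of u0 along vk; the eigenvectors satisfy sum(vk^2) = m/2
        ck = (2.0 / m) * sum(u * v for u, v in zip(u0_interior, vk))
        # each mode decays at the rate set by the fractional power of lam
        decay = math.exp(-K * lam ** (alpha / 2.0) * t)
        for i in range(n):
            out[i] += ck * decay * vk[i]
    return out
```

With `alpha = 2` this reduces to the usual semi-discrete heat equation. For a general matrix without closed-form eigenpairs, a numerical eigendecomposition or a matrix-function routine would take the place of the explicit sine modes used here.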