137 results for Optimal control


Relevance: 70.00%

Abstract:

A new computational tool is presented in this paper for suboptimal control design of a class of nonlinear distributed parameter systems. First, problem-oriented basis functions based on proper orthogonal decomposition are designed, which are then used in a Galerkin projection to obtain a low-order lumped parameter approximation. Next, a suboptimal controller is designed using the emerging θ-D technique for lumped parameter systems. This time-domain suboptimal control solution is then mapped back to the distributed domain using the same basis functions, which essentially leads to a closed-form solution for the controller in a state feedback form. Numerical results for a real-life nonlinear temperature control problem indicate that the proposed method holds promise as a good suboptimal control design technique for distributed parameter systems.
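The basis-design step described in this abstract can be illustrated with a minimal sketch, assuming the basis functions are taken as the leading left singular vectors of a snapshot matrix. The function name `pod_basis`, the energy threshold, and the synthetic snapshot data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def pod_basis(snapshots, energy=0.99):
    """Return leading POD modes of a snapshot matrix (one snapshot per column).

    Modes are kept until the retained fraction of squared singular values
    reaches `energy` (a hypothetical truncation rule for this sketch).
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy) + 1)
    return U[:, :r]

# Synthetic snapshots built from two spatial shapes (illustrative data only).
x = np.linspace(0.0, 1.0, 50)
t = np.linspace(0.0, 1.0, 20)
snapshots = (np.outer(np.sin(np.pi * x), np.cos(t))
             + 0.1 * np.outer(np.sin(2 * np.pi * x), np.sin(3 * t)))

basis = pod_basis(snapshots, energy=0.9999)
reduced = basis.T @ snapshots        # low-order (lumped) coordinates
reconstruction = basis @ reduced     # mapped back to the spatial grid
```

The projection `basis.T @ snapshots` plays the role of the reduction to lumped coordinates, and `basis @ reduced` maps a low-order quantity back to the distributed domain with the same basis.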

Relevance: 70.00%

Abstract:

Diabetes is a long-term disease during which the body's production and use of insulin are impaired, causing glucose concentration level to increase in the bloodstream. Regulating blood glucose levels as close to normal as possible leads to a substantial decrease in long-term complications of diabetes. In this paper, an intelligent online feedback-treatment strategy is presented for the control of blood glucose levels in diabetic patients using single network adaptive critic (SNAC) neural networks (which is based on nonlinear optimal control theory). A recently developed mathematical model of the nonlinear dynamics of glucose and insulin interaction in the blood system has been revised and considered for synthesizing the neural network for feedback control. The idea is to replicate the function of pancreatic insulin, i.e. to have a fairly continuous measurement of blood glucose and a situation-dependent insulin injection to the body using an external device. Detailed studies are carried out to analyze the effectiveness of this adaptive critic-based feedback medication strategy. A comparison study with linear quadratic regulator (LQR) theory shows that the proposed nonlinear approach offers some important advantages such as quicker response, avoidance of hypoglycemia problems, etc. Robustness of the proposed approach is also demonstrated from a large number of simulations considering random initial conditions and parametric uncertainties. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevance: 70.00%

Abstract:

We study the trade-off between delivery delay and energy consumption in delay tolerant mobile wireless networks that use two-hop relaying. The source may not have perfect knowledge of the delivery status at every instant. We formulate the problem as a stochastic control problem with partial information, and study structural properties of the optimal policy. We also propose a simple suboptimal policy. We then compare the performance of the suboptimal policy against that of the optimal control with perfect information. These are bounds on the performance of the proposed policy with partial information. Several other related open loop policies are also compared with these bounds.

Relevance: 70.00%

Abstract:

We consider the problem of quickest detection of an intrusion using a sensor network, keeping only a minimal number of sensors active. By using a minimal number of sensor devices, we ensure that the energy expenditure for sensing, computation and communication is minimized (and the lifetime of the network is maximized). We model the intrusion detection (or change detection) problem as a Markov decision process (MDP). Based on the theory of MDP, we develop the following closed loop sleep/wake scheduling algorithms: 1) optimal control of $M_{k+1}$, the number of sensors in the wake state in time slot $k+1$, 2) optimal control of $q_{k+1}$, the probability of a sensor in the wake state in time slot $k+1$, and an open loop sleep/wake scheduling algorithm which 3) computes $q$, the optimal probability of a sensor in the wake state (which does not vary with time), based on the sensor observations obtained until time slot $k$. Our results show that an optimum closed loop control on $M_{k+1}$ significantly decreases the cost compared to keeping any number of sensors active all the time. Also, among the three algorithms described, we observe that the total cost is minimum for the optimum control on $M_{k+1}$ and is maximum for the optimum open loop control on $q$.

Relevance: 70.00%

Abstract:

In this paper we incorporate a novel approach to synthesize a class of closed-loop feedback control, based on the variational structure assignment. Properties of a viscoelastic system are used to design an active feedback controller for an undamped structural system with distributed sensor, actuator and controller. Wave dispersion properties of a one-dimensional beam system have been studied. The efficiency of the chosen viscoelastic model in enhancing damping and stability properties of a one-dimensional viscoelastic bar has been analyzed. The variational structure is projected on a solution space of a closed-loop system involving a weakly damped structure with distributed sensor and actuator with controller. These assign the phenomenology-based internal strain rate damping parameter of a viscoelastic system to the usual elastic structure but with active control. In the formulation, a model of a cantilever beam with non-collocated actuator and sensor has been considered. The formulation leads to the matrix identification problem of two dynamic stiffness matrices. The method has been simplified to obtain control system gains for the free vibration control of a cantilever beam system with collocated actuator-sensor, using quadratic optimal control and pole-placement methods.

Relevance: 70.00%

Abstract:

An optimal control law for a general nonlinear system can be obtained by solving the Hamilton-Jacobi-Bellman (HJB) equation. However, it is difficult to obtain an analytical solution of this equation even for a moderately complex system. In this paper, we propose a continuous-time single network adaptive critic scheme for nonlinear control-affine systems where the optimal cost-to-go function is approximated using a parametric positive semi-definite function. Unlike earlier approaches, a continuous-time weight update law is derived from the HJB equation. The stability of the system is analysed during the evolution of weights using Lyapunov theory. The effectiveness of the scheme is demonstrated through simulation examples.
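For orientation, the HJB equation referred to in this abstract can be written out in its standard textbook form for a control-affine system with a cost that is quadratic in the control. The symbols $f$, $g$, $Q$, $R$ and $V$ below are assumed notation for this sketch and may differ from the paper's.

```latex
% Assumed setting: \dot{x} = f(x) + g(x)u, cost J = \int_0^\infty ( Q(x) + u^\top R u ) dt, R > 0.
\begin{align}
0 &= \min_{u}\Big[\, Q(x) + u^\top R\,u + \nabla V(x)^\top \big( f(x) + g(x)\,u \big) \Big], \\
u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V(x), \\
0 &= Q(x) + \nabla V(x)^\top f(x) - \tfrac{1}{4}\, \nabla V(x)^\top g(x)\, R^{-1} g(x)^\top \nabla V(x).
\end{align}
```

The last line follows by substituting $u^{*}$ into the first. It is a nonlinear partial differential equation in the cost-to-go $V$, which is why a parametric approximation of $V$ is attractive.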

Relevance: 70.00%

Abstract:

Optimal preventive maintenance policies, for a machine subject to deterioration with age and intermittent breakdowns and repairs, are derived using optimal control theory. The optimal policies are shown to be of bang-bang nature. The extension to the case when there are a large number of identical machines and several repairmen in the system is considered next. This model takes into account the waiting line formed at the repair facility and establishes a link between this problem and the classical "repairmen problem."

Relevance: 70.00%

Abstract:

Optimal maintenance policies for a machine with degradation in performance with age and subject to failure are derived using optimal control theory. The optimal policies are shown to be, normally, of bang-coast nature, except in the case when the probability of machine failure is a function of maintenance. It is also shown that, in the deterministic case, a higher depreciation rate tends to reverse this policy to coast-bang. When the probability of failure is a function of maintenance, considerable computational effort is needed to obtain an optimal policy and the resulting policy is not easily implementable. For this case also, an optimal policy in the class of bang-coast policies is derived, using a semi-Markov decision model. A simple procedure for modifying the probability of machine failure with maintenance is employed. The results obtained extend and unify the recent results for this problem along both theoretical and practical lines. Numerical examples are presented to illustrate the results obtained.

Relevance: 70.00%

Abstract:

This paper presents the design and implementation of a learning controller for the Automatic Generation Control (AGC) in power systems based on a reinforcement learning (RL) framework. In contrast to the recent RL scheme for AGC proposed by us, the present method permits handling of power system variables such as Area Control Error (ACE) and deviations from scheduled frequency and tie-line flows as continuous variables. (In the earlier scheme, these variables had to be quantized into finitely many levels.) The optimal control law is arrived at in the RL framework by making use of a Q-learning strategy. Since the state variables are continuous, we propose the use of Radial Basis Function (RBF) neural networks to compute the Q-values for a given input state. Since, in this application, we cannot provide training data appropriate for the standard supervised learning framework, a reinforcement learning algorithm is employed to train the RBF network. We also employ a novel exploration strategy, based on a Learning Automata algorithm, for generating training samples during Q-learning. The proposed scheme, in addition to being simple to implement, inherits all the attractive features of an RL scheme, such as model-independent design, flexibility in control objective specification and robustness. Two implementations of the proposed approach are presented. Through simulation studies the attractiveness of this approach is demonstrated.
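The idea of computing Q-values for continuous states with an RBF network can be sketched as follows, assuming Gaussian basis functions with fixed centers and one linear weight vector per action. The class name `RBFQ`, the step size, the discount factor, and the reward convention are hypothetical choices for this sketch. It does not reproduce the paper's network or its Learning Automata exploration strategy.

```python
import numpy as np

class RBFQ:
    """Q-function approximated as a linear combination of Gaussian RBFs."""

    def __init__(self, centers, width, n_actions):
        self.centers = np.asarray(centers, dtype=float)   # shape (n_centers, state_dim)
        self.width = float(width)
        self.weights = np.zeros((n_actions, len(self.centers)))

    def features(self, state):
        # Gaussian activation of every center for the given continuous state.
        d2 = np.sum((self.centers - np.asarray(state, dtype=float)) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def q_values(self, state):
        return self.weights @ self.features(state)

    def update(self, state, action, reward, next_state, alpha=0.1, gamma=0.9):
        # One Q-learning step on the weights of the chosen action.
        phi = self.features(state)
        target = reward + gamma * np.max(self.q_values(next_state))
        td_error = target - self.weights[action] @ phi
        self.weights[action] += alpha * td_error * phi
        return td_error

# Illustrative use on a one-dimensional state with two centers and two actions.
q = RBFQ(centers=[[0.0], [1.0]], width=0.5, n_actions=2)
first_td_error = q.update([0.0], action=0, reward=1.0, next_state=[1.0])
```

Because only the weights of the chosen action are adjusted, no labeled training data is needed: each observed transition supplies its own target.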

Relevance: 70.00%

Abstract:

We consider the problem of devising incentive strategies for viral marketing of a product. In particular, we assume that the seller can influence penetration of the product by offering two incentive programs: a) direct incentives to potential buyers (influence) and b) referral rewards for customers who influence potential buyers to make the purchase (exploit connections). The problem is to determine the optimal timing of these programs over a finite time horizon. In contrast to the algorithmic perspective popular in the literature, we take a mean-field approach and formulate the problem as a continuous-time deterministic optimal control problem. We show that the optimal strategy for the seller has a simple structure and can take both forms, namely, influence-and-exploit and exploit-and-influence. We also show that in some cases it may be optimal for the seller to deploy incentive programs mostly for low-degree nodes. We support our theoretical results through numerical studies and provide practical insights by analyzing various scenarios.

Relevance: 70.00%

Abstract:

This paper presents an advanced single network adaptive critic (SNAC) aided nonlinear dynamic inversion (NDI) approach for simultaneous attitude control and trajectory tracking of a micro-quadrotor. Control of micro-quadrotors is a challenging problem due to their small size, strong coupling in pitch-yaw-roll and aerodynamic effects that often need to be ignored in the control design process to avoid mathematical complexities. In the proposed SNAC aided NDI approach, the gains of the dynamic inversion design are selected in such a way that the resulting controller behaves closely to a pre-synthesized SNAC controller for the output regulation problem. Since SNAC is based on optimal control theory, it makes the dynamic inversion controller operate near-optimally and enhances its robustness as well. More importantly, it retains two major benefits of dynamic inversion: (i) a closed-form expression of the controller and (ii) easy scalability to command tracking applications even without any a priori knowledge of the reference command. The effectiveness of the proposed controller is demonstrated through six degree-of-freedom simulation studies of a micro-quadrotor. It has also been observed that the proposed SNAC aided NDI approach is more robust to modeling inaccuracies than the NDI controller designed independently from time-domain specifications.

Relevance: 70.00%

Abstract:

We study risk-sensitive control of continuous-time Markov chains taking values in a discrete state space. We study both finite and infinite horizon problems. In the finite horizon problem we characterize the value function via the Hamilton-Jacobi-Bellman equation and obtain an optimal Markov control. We do the same for the infinite horizon discounted cost case. In the infinite horizon average cost case we establish the existence of an optimal stationary control under a certain Lyapunov condition. We also develop a policy iteration algorithm for finding an optimal control.

Relevance: 70.00%

Abstract:

To combine the advantages of both stability and optimality-based designs, a single network adaptive critic (SNAC) aided nonlinear dynamic inversion approach is presented in this paper. Here, the gains of a dynamic inversion controller are selected in such a way that the resulting controller behaves very close to a pre-synthesized SNAC controller in the output regulation sense. Because SNAC is based on optimal control theory, it makes the dynamic inversion controller operate nearly optimally. More importantly, it retains the two major benefits of dynamic inversion, namely (i) a closed-form expression of the controller and (ii) easy scalability to command tracking applications without knowing the reference commands a priori. An extended architecture is also presented in this paper that adapts online to system modeling and inversion errors, as well as reduced control effectiveness, thereby leading to enhanced robustness. The strengths of this hybrid method of applying SNAC to optimize a nonlinear dynamic inversion controller are demonstrated by considering a benchmark problem in robotics, that is, a two-link robotic manipulator system. Copyright (C) 2013 John Wiley & Sons, Ltd.

Relevance: 70.00%

Abstract:

Optimal control of traffic lights at junctions or traffic signal control (TSC) is essential for reducing the average delay experienced by the road users amidst the rapid increase in the usage of vehicles. In this paper, we formulate the TSC problem as a discounted cost Markov decision process (MDP) and apply multi-agent reinforcement learning (MARL) algorithms to obtain dynamic TSC policies. We model each traffic signal junction as an independent agent. An agent decides the signal duration of its phases in a round-robin (RR) manner using multi-agent Q-learning with either ε-greedy or UCB [3] based exploration strategies. It updates its Q-factors based on the cost feedback signal received from its neighbouring agents. This feedback signal can be easily constructed and is shown to be effective in minimizing the average delay of the vehicles in the network. We show through simulations over VISSIM that our algorithms perform significantly better than both the standard fixed signal timing (FST) algorithm and the saturation balancing (SAT) algorithm [15] over two real road networks.
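The Q-factor update with ε-greedy exploration mentioned in this abstract can be sketched in tabular form for a single agent that minimizes discounted cost. The function names, the step size `alpha`, and the discount factor `gamma` are assumptions for illustration. The paper's multi-agent cost feedback from neighbouring junctions and its UCB variant are not modeled here.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the lowest-cost one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return min(range(len(q_row)), key=q_row.__getitem__)

def q_update(q, state, action, cost, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step for a cost-minimizing agent (in-place on table q)."""
    target = cost + gamma * min(q[next_state])
    q[state][action] += alpha * (target - q[state][action])

# Illustrative two-state, two-action table (values are made up).
q_table = [[0.0, 0.0], [1.0, 2.0]]
q_update(q_table, state=0, action=1, cost=5.0, next_state=1)
greedy_action = epsilon_greedy([3.0, 1.0, 2.0], epsilon=0.0, rng=random.Random(0))
```

Since the objective is a cost, both the greedy choice and the bootstrap target use `min` where a reward formulation would use `max`.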

Relevance: 60.00%

Abstract:

The recently developed single network adaptive critic (SNAC) design has been used in this study to design a power system stabiliser (PSS) for enhancing the small-signal stability of power systems over a wide range of operating conditions. PSS design is formulated as a discrete non-linear quadratic regulator problem. SNAC is then used to solve the resulting discrete-time optimal control problem. SNAC uses only a single critic neural network instead of the action-critic dual network architecture of typical adaptive critic designs. SNAC eliminates the iterative training loops between the action and critic networks and greatly simplifies the training procedure. The performance of the proposed PSS has been tested on a single machine infinite bus test system for various system and loading conditions. The proposed stabiliser, which is relatively easier to synthesise, consistently outperformed stabilisers based on conventional lead-lag and linear quadratic regulator designs.
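The linear quadratic regulator used as a comparison baseline in this abstract can be illustrated with a short sketch of the discrete-time LQR gain, obtained here by iterating the Riccati difference equation to a fixed point. The function name `dlqr_gain`, the iteration count, and the double-integrator test system are assumptions for illustration and are unrelated to the power system model in the paper.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iterations=500):
    """Approximate the infinite-horizon discrete LQR gain K (u = -K x)."""
    P = Q.copy()
    for _ in range(iterations):
        # Gain for the current cost-to-go matrix P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati recursion: P = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA.
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

# Illustrative system: double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K, P = dlqr_gain(A, B, Q, R)
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

A fixed iteration count keeps the sketch short. A production implementation would more likely check convergence of `P` or call a dedicated Riccati solver.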