895 resultados para Optimal Control


Relevância:

70.00% 70.00%

Publicador:

Resumo:

We study the trade-off between delivery delay and energy consumption in delay tolerant mobile wireless networks that use two-hop relaying. The source may not have perfect knowledge of the delivery status at every instant. We formulate the problem as a stochastic control problem with partial information, and study structural properties of the optimal policy. We also propose a simple suboptimal policy. We then compare the performance of the suboptimal policy against that of the optimal control with perfect information. These are bounds on the performance of the proposed policy with partial information. Several other related open loop policies are also compared with these bounds.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We consider the problem of quickest detection of an intrusion using a sensor network, keeping only a minimal number of sensors active. By using a minimal number of sensor devices,we ensure that the energy expenditure for sensing, computation and communication is minimized (and the lifetime of the network is maximized). We model the intrusion detection (or change detection) problem as a Markov decision process (MDP). Based on the theory of MDP, we develop the following closed loop sleep/wake scheduling algorithms: 1) optimal control of Mk+1, the number of sensors in the wake state in time slot k + 1, 2) optimal control of qk+1, the probability of a sensor in the wake state in time slot k + 1, and an open loop sleep/wake scheduling algorithm which 3) computes q, the optimal probability of a sensor in the wake state (which does not vary with time),based on the sensor observations obtained until time slot k.Our results show that an optimum closed loop control onMk+1 significantly decreases the cost compared to keeping any number of sensors active all the time. Also, among the three algorithms described, we observe that the total cost is minimum for the optimum control on Mk+1 and is maximum for the optimum open loop control on q.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper we incorporate a novel approach to synthesize a class of closed-loop feedback control, based on the variational structure assignment. Properties of a viscoelastic system are used to design an active feedback controller for an undamped structural system with distributed sensor, actuator and controller. Wave dispersion properties of onedimensional beam system have been studied. Efficiency of the chosen viscoelastic model in enhancing damping and stability properties of one-dimensional viscoelastic bar have been analyzed. The variational structure is projected on a solution space of a closed-loop system involving a weakly damped structure with distributed sensor and actuator with controller. These assign the phenomenology based internal strain rate damping parameter of a viscoelastic system to the usual elastic structure but with active control. In the formulation a model of cantilever beam with non-collocated actuator and sensor has been considered. The formulation leads to the matrix identification problem of two dynamic stiffness matrices. The method has been simplified to obtain control system gains for the free vibration control of a cantilever beam system with collocated actuator-sensor, using quadratic optimal control and pole-placement methods.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

An optimal control law for a general nonlinear system can be obtained by solving Hamilton-Jacobi-Bellman equation. However, it is difficult to obtain an analytical solution of this equation even for a moderately complex system. In this paper, we propose a continuoustime single network adaptive critic scheme for nonlinear control affine systems where the optimal cost-to-go function is approximated using a parametric positive semi-definite function. Unlike earlier approaches, a continuous-time weight update law is derived from the HJB equation. The stability of the system is analysed during the evolution of weights using Lyapunov theory. The effectiveness of the scheme is demonstrated through simulation examples.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Optimal preventive maintenance policies, for a machine subject to deterioration with age and intermittent breakdowns and repairs, are derived using optimal control theory. The optimal policies are shown to be of bang-bang nature. The extension to the case when there are a large number of identical machines and several repairmen in the system is considered next. This model takes into account the waiting line formed at the repair facility and establishes a link between this problem and the classical ``repairmen problem.''

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Optimal maintenance policies for a machine with degradation in performance with age and subject to failure are derived using optimal control theory. The optimal policies are shown to be, normally, of bang-coast nature, except in the case when probability of machine failure is a function of maintenance. It is also shown, in the deterministic case that a higher depreciation rate tends to reverse this policy to coast-bang. When the probability of failure is a function of maintenance, considerable computational effort is needed to obtain an optimal policy and the resulting policy is not easily implementable. For this case also, an optimal policy in the class of bang-coast policies is derived, using a semi-Markov decision model. A simple procedure for modifying the probability of machine failure with maintenance is employed. The results obtained extend and unify the recent results for this problem along both theoretical and practical lines. Numerical examples are presented to illustrate the results obtained.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents the design and implementation of a learning controller for the Automatic Generation Control (AGC) in power systems based on a reinforcement learning (RL) framework. In contrast to the recent RL scheme for AGC proposed by us, the present method permits handling of power system variables such as Area Control Error (ACE) and deviations from scheduled frequency and tie-line flows as continuous variables. (In the earlier scheme, these variables have to be quantized into finitely many levels). The optimal control law is arrived at in the RL framework by making use of Q-learning strategy. Since the state variables are continuous, we propose the use of Radial Basis Function (RBF) neural networks to compute the Q-values for a given input state. Since, in this application we cannot provide training data appropriate for the standard supervised learning framework, a reinforcement learning algorithm is employed to train the RBF network. We also employ a novel exploration strategy, based on a Learning Automata algorithm,for generating training samples during Q-learning. The proposed scheme, in addition to being simple to implement, inherits all the attractive features of an RL scheme such as model independent design, flexibility in control objective specification, robustness etc. Two implementations of the proposed approach are presented. Through simulation studies the attractiveness of this approach is demonstrated.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We consider the problem of devising incentive strategies for viral marketing of a product. In particular, we assume that the seller can influence penetration of the product by offering two incentive programs: a) direct incentives to potential buyers (influence) and b) referral rewards for customers who influence potential buyers to make the purchase (exploit connections). The problem is to determine the optimal timing of these programs over a finite time horizon. In contrast to algorithmic perspective popular in the literature, we take a mean-field approach and formulate the problem as a continuous-time deterministic optimal control problem. We show that the optimal strategy for the seller has a simple structure and can take both forms, namely, influence-and-exploit and exploit-and-influence. We also show that in some cases it may optimal for the seller to deploy incentive programs mostly for low degree nodes. We support our theoretical results through numerical studies and provide practical insights by analyzing various scenarios.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents an advanced single network adaptive critic (SNAC) aided nonlinear dynamic inversion (NDI) approach for simultaneous attitude control and trajectory tracking of a micro-quadrotor. Control of micro-quadrotors is a challenging problem due to its small size, strong coupling in pitch-yaw-roll and aerodynamic effects that often need to be ignored in the control design process to avoid mathematical complexities. In the proposed SNAC aided NDI approach, the gains of the dynamic inversion design are selected in such a way that the resulting controller behaves closely to a pre-synthesized SNAC controller for the output regulation problem. However, since SNAC is based on optimal control theory, it makes the dynamic inversion controller to operate near optimal and enhances its robustness property as well. More important, it retains two major benefits of dynamic inversion: (i) closed form expression of the controller and (ii) easy scalability to command tracking application even without any apriori knowledge of the reference command. Effectiveness of the proposed controller is demonstrated from six degree-of-freedom simulation studies of a micro-quadrotor. It has also been observed that the proposed SNAC aided NDI approach is more robust to modeling inaccuracies, as compared to the NDI controller designed independently from time domain specifications.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We study risk-sensitive control of continuous time Markov chains taking values in discrete state space. We study both finite and infinite horizon problems. In the finite horizon problem we characterize the value function via Hamilton Jacobi Bellman equation and obtain an optimal Markov control. We do the same for infinite horizon discounted cost case. In the infinite horizon average cost case we establish the existence of an optimal stationary control under certain Lyapunov condition. We also develop a policy iteration algorithm for finding an optimal control.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

To combine the advantages of both stability and optimality-based designs, a single network adaptive critic (SNAC) aided nonlinear dynamic inversion approach is presented in this paper. Here, the gains of a dynamic inversion controller are selected in such a way that the resulting controller behaves very close to a pre-synthesized SNAC controller in the output regulation sense. Because SNAC is based on optimal control theory, it makes the dynamic inversion controller operate nearly optimal. More important, it retains the two major benefits of dynamic inversion, namely (i) a closed-form expression of the controller and (ii) easy scalability to command tracking applications without knowing the reference commands a priori. An extended architecture is also presented in this paper that adapts online to system modeling and inversion errors, as well as reduced control effectiveness, thereby leading to enhanced robustness. The strengths of this hybrid method of applying SNAC to optimize an nonlinear dynamic inversion controller is demonstrated by considering a benchmark problem in robotics, that is, a two-link robotic manipulator system. Copyright (C) 2013 John Wiley & Sons, Ltd.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Optimal control of traffic lights at junctions or traffic signal control (TSC) is essential for reducing the average delay experienced by the road users amidst the rapid increase in the usage of vehicles. In this paper, we formulate the TSC problem as a discounted cost Markov decision process (MDP) and apply multi-agent reinforcement learning (MARL) algorithms to obtain dynamic TSC policies. We model each traffic signal junction as an independent agent. An agent decides the signal duration of its phases in a round-robin (RR) manner using multi-agent Q-learning with either is an element of-greedy or UCB 3] based exploration strategies. It updates its Q-factors based on the cost feedback signal received from its neighbouring agents. This feedback signal can be easily constructed and is shown to be effective in minimizing the average delay of the vehicles in the network. We show through simulations over VISSIM that our algorithms perform significantly better than both the standard fixed signal timing (FST) algorithm and the saturation balancing (SAT) algorithm 15] over two real road networks.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper, several simplification methods are presented for shape control of repetitive structures such as symmetrical, rotational periodic, linear periodic, chain and axisymmetrical structures. Some special features in the differential equations governing these repetitive structures are examined by considering the whole structures. Based on the special properties of the governing equations, several methods are presented for simplifying their solution process. Finally, the static shape control of a cantilever symmetrical plate with piezoelectric actuator patches is demonstrated using the present simplification method. The result shows that present methods can effectively be used to find the optimal control voltage for shape control.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Sequential Monte Carlo methods, also known as particle methods, are a widely used set of computational tools for inference in non-linear non-Gaussian state-space models. In many applications it may be necessary to compute the sensitivity, or derivative, of the optimal filter with respect to the static parameters of the state-space model; for instance, in order to obtain maximum likelihood model parameters of interest, or to compute the optimal controller in an optimal control problem. In Poyiadjis et al. [2011] an original particle algorithm to compute the filter derivative was proposed and it was shown using numerical examples that the particle estimate was numerically stable in the sense that it did not deteriorate over time. In this paper we substantiate this claim with a detailed theoretical study. Lp bounds and a central limit theorem for this particle approximation of the filter derivative are presented. It is further shown that under mixing conditions these Lp bounds and the asymptotic variance characterized by the central limit theorem are uniformly bounded with respect to the time index. We demon- strate the performance predicted by theory with several numerical examples. We also use the particle approximation of the filter derivative to perform online maximum likelihood parameter estimation for a stochastic volatility model.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This thesis is motivated by safety-critical applications involving autonomous air, ground, and space vehicles carrying out complex tasks in uncertain and adversarial environments. We use temporal logic as a language to formally specify complex tasks and system properties. Temporal logic specifications generalize the classical notions of stability and reachability that are studied in the control and hybrid systems communities. Given a system model and a formal task specification, the goal is to automatically synthesize a control policy for the system that ensures that the system satisfies the specification. This thesis presents novel control policy synthesis algorithms for optimal and robust control of dynamical systems with temporal logic specifications. Furthermore, it introduces algorithms that are efficient and extend to high-dimensional dynamical systems.

The first contribution of this thesis is the generalization of a classical linear temporal logic (LTL) control synthesis approach to optimal and robust control. We show how we can extend automata-based synthesis techniques for discrete abstractions of dynamical systems to create optimal and robust controllers that are guaranteed to satisfy an LTL specification. Such optimal and robust controllers can be computed at little extra computational cost compared to computing a feasible controller.

The second contribution of this thesis addresses the scalability of control synthesis with LTL specifications. A major limitation of the standard automaton-based approach for control with LTL specifications is that the automaton might be doubly-exponential in the size of the LTL specification. We introduce a fragment of LTL for which one can compute feasible control policies in time polynomial in the size of the system and specification. Additionally, we show how to compute optimal control policies for a variety of cost functions, and identify interesting cases when this can be done in polynomial time. These techniques are particularly relevant for online control, as one can guarantee that a feasible solution can be found quickly, and then iteratively improve on the quality as time permits.

The final contribution of this thesis is a set of algorithms for computing feasible trajectories for high-dimensional, nonlinear systems with LTL specifications. These algorithms avoid a potentially computationally-expensive process of computing a discrete abstraction, and instead compute directly on the system's continuous state space. The first method uses an automaton representing the specification to directly encode a series of constrained-reachability subproblems, which can be solved in a modular fashion by using standard techniques. The second method encodes an LTL formula as mixed-integer linear programming constraints on the dynamical system. We demonstrate these approaches with numerical experiments on temporal logic motion planning problems with high-dimensional (10+ states) continuous systems.