Results for "Macchina automatica avvolgipallet"
at Indian Institute of Science - Bangalore - India
Abstract:
Given a plant P, we consider the problem of designing a pair of controllers C1 and C2 such that their sum stabilizes P, and in addition, each of them also stabilizes P should the other one fail. This is referred to as the reliable stabilization problem. It is shown that every strongly stabilizable plant can be reliably stabilized; moreover, one of the two controllers can be specified arbitrarily, subject only to the constraint that it should be stable. The stabilization technique is extended to reliable regulation.
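Written out using the usual unity-feedback internal-stability condition (one common convention; the notation is assumed here, not taken from the paper), the requirement on the pair (C1, C2) is

\[
H(P, C) = \begin{bmatrix} (1 + PC)^{-1} & -P(1 + CP)^{-1} \\ C(1 + PC)^{-1} & (1 + CP)^{-1} \end{bmatrix} \ \text{stable} \quad \text{for each } C \in \{\, C_1,\ C_2,\ C_1 + C_2 \,\},
\]

and the stated freedom is that one of the controllers, say C1, may be chosen as any stable transfer function.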
Abstract:
A cooperative game played in a sequential manner by a pair of learning automata is investigated in this paper. The automata operate in an unknown random environment which gives a common pay-off to the automata. Necessary and sufficient conditions on the functions in the reinforcement scheme are given for absolute monotonicity, which ensures that the expected pay-off increases monotonically in any arbitrary environment. As each participating automaton operates with no information about its partner, the results of the paper are relevant to decentralized control.
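As a concrete instance, here is a minimal simulation sketch using the classical linear reward-inaction (L_R-I) scheme, one member of the general class of reinforcement schemes the abstract analyzes; the environment's pay-off probabilities below are hypothetical:

    import numpy as np

    rng = np.random.default_rng(3)
    n_actions = 2
    # D[i, j]: probability of a (binary) common pay-off for action pair (i, j)
    D = np.array([[0.2, 0.8],
                  [0.6, 0.4]])
    p1 = np.full(n_actions, 1.0 / n_actions)   # action probabilities, automaton 1
    p2 = np.full(n_actions, 1.0 / n_actions)   # action probabilities, automaton 2
    lam = 0.01                                 # learning parameter

    for t in range(20000):
        a1 = rng.choice(n_actions, p=p1)   # each automaton chooses with no
        a2 = rng.choice(n_actions, p=p2)   # knowledge of its partner's choice
        if rng.random() < D[a1, a2]:       # reward-inaction: move only on success
            e1 = np.zeros(n_actions); e1[a1] = 1.0
            e2 = np.zeros(n_actions); e2[a2] = 1.0
            p1 += lam * (e1 - p1)
            p2 += lam * (e2 - p2)

    print(np.round(p1, 2), np.round(p2, 2))   # tends toward the best action pair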
Abstract:
A strongly connected decentralized control system may be made single channel controllable and observable with respect to any channel by decentralized feedbacks. It is noted here that the system example considered by Corfmat and Morse to illustrate this fact is already single channel controllable and observable with respect to one of the channels. An alternative example that does fit the situation is presented here.
Abstract:
By deriving the equations for an error analysis of modeling inaccuracies for the combined estimation and control problem, it is shown that the optimum estimation error is orthogonal to the actual suboptimum estimate.
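In symbols (notation assumed, not taken from the paper): writing x for the state, x̂* for the optimum estimate, and x̂ for the actual suboptimum estimate, the stated orthogonality is

\[
\mathbb{E}\left[ (x - \hat{x}^{*})\, \hat{x}^{\top} \right] = 0 .
\]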
Abstract:
We present four new reinforcement learning algorithms based on actor-critic, natural-gradient and function-approximation ideas, and we provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function-approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of special interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients. Our results extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
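A minimal sketch of the actor-critic template described above: a tabular TD(0) critic and a softmax policy updated by stochastic gradient ascent, run on a hypothetical two-state MDP. This illustrates the structure only; the paper's four algorithms, natural-gradient updates, and function-approximation machinery are not reproduced.

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions, gamma = 2, 2, 0.95
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transitions
    R = rng.standard_normal((n_states, n_actions))                    # rewards

    v = np.zeros(n_states)                    # critic: state-value estimates
    theta = np.zeros((n_states, n_actions))   # actor: policy parameters

    def policy(s):
        z = np.exp(theta[s] - theta[s].max())
        return z / z.sum()

    s = 0
    for t in range(1, 20001):
        alpha = 1.0 / t ** 0.6                # critic step size (faster timescale)
        beta = 1.0 / t                        # actor step size (slower timescale)
        pi = policy(s)
        a = rng.choice(n_actions, p=pi)
        s_next = rng.choice(n_states, p=P[s, a])
        delta = R[s, a] + gamma * v[s_next] - v[s]   # TD error
        v[s] += alpha * delta                        # critic: TD(0) update
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0                        # d log pi(a|s) / d theta[s]
        theta[s] += beta * delta * grad_log_pi       # actor: policy-gradient step
        s = s_next

    print("value estimates:", np.round(v, 2))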
Abstract:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates the Q-values of all feasible state-action pairs at each instant, while the second updates the Q-values of states with actions chosen according to the 'current' randomized policy. A proof of convergence of the algorithms is given. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented for a few different settings.
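A minimal sketch of the first variant's structure: Q-values of all feasible state-action pairs are updated with a faster step-size schedule, while a randomized policy is adjusted on a slower one. The toy model and step-size schedules below are assumptions, not the paper's exact algorithms.

    import numpy as np

    rng = np.random.default_rng(1)
    nS, nA, gamma = 3, 2, 0.9
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # toy transition model
    R = rng.standard_normal((nS, nA))               # toy rewards

    Q = np.zeros((nS, nA))
    pi = np.full((nS, nA), 1.0 / nA)   # the 'current' randomized policy

    for n in range(1, 5001):
        a_n = 1.0 / n                          # faster step size (Q-values)
        b_n = 1.0 / (1.0 + n * np.log(n + 1))  # slower step size (policy)
        # faster timescale: update Q-values of all feasible state-action pairs
        for s in range(nS):
            for a in range(nA):
                s_next = rng.choice(nS, p=P[s, a])   # simulated transition
                Q[s, a] += a_n * (R[s, a] + gamma * Q[s_next].max() - Q[s, a])
        # slower timescale: pull the randomized policy toward the greedy one
        # (in the second variant, actions would instead be sampled from pi)
        for s in range(nS):
            greedy = np.zeros(nA)
            greedy[Q[s].argmax()] = 1.0
            pi[s] += b_n * (greedy - pi[s])

    print(np.round(Q, 2))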
Abstract:
It is shown that a sufficient condition for the asymptotic stability-in-the-large of an autonomous system containing a linear part with transfer function G(jω) and a non-linearity belonging to a class of power-law non-linearities with slope restriction [0, K], in cascade in a negative feedback loop, is Re{Z(jω)[G(jω) + 1/K]} ≥ 0 for all ω, where the multiplier is given by Z(jω) = 1 + αjω + Y(jω) − Y(−jω) with α real, y(t) = 0 for t < 0 and ∫₀^∞ |y(t)| dt < 1/(2c₂), c₂ being a constant associated with the class of non-linearity. Any allowable multiplier can be converted to the above form, and this form leads to fewer restrictions on the parameters in many cases. Criteria for the case of odd monotonic non-linearities and of linear gains are obtained as limiting cases of the criterion developed. A striking feature of the present result is that in the linear case it reduces to the necessary and sufficient conditions corresponding to the Nyquist criterion. An inequality of the type |R(T) − R(−T)| ≤ 2c₂R(0), where R(T) is the input-output cross-correlation function of the non-linearity, is used in deriving the results.
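For readability, the criterion just stated can be displayed as

\[
\operatorname{Re}\left\{ Z(j\omega)\left[ G(j\omega) + \frac{1}{K} \right] \right\} \;\ge\; 0 \quad \text{for all } \omega,
\qquad
Z(j\omega) = 1 + \alpha j\omega + Y(j\omega) - Y(-j\omega),
\]
\[
\alpha \in \mathbb{R}, \qquad y(t) = 0 \ \text{for } t < 0, \qquad \int_{0}^{\infty} |y(t)|\, dt < \frac{1}{2c_{2}} .
\]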
Abstract:
A linear state feedback gain vector used in the control of a single input dynamical system may be constrained because of the way feedback is realized. Some examples of feedback realizations which impose constraints on the gain vector are: static output feedback, constant gain feedback for several operating points of a system, and two-controller feedback. We consider a general class of problems of stabilization of single input dynamical systems with such structural constraints and give a numerical method to solve them. Each of these problems is cast into a problem of solving a system of equalities and inequalities. In this formulation, the coefficients of the quadratic and linear factors of the closed-loop characteristic polynomial are the variables. To solve the system of equalities and inequalities, a continuous realization of the gradient projection method and a barrier method are used under the homotopy framework. Our method is illustrated with an example for each class of control structure constraint.
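A sketch of the formulation described above, with the factored form an assumption consistent with the abstract: for an n-th order closed loop, the characteristic polynomial is written as

\[
p_{\mathrm{cl}}(s) = \prod_{i} \left( s^{2} + b_{i}\, s + c_{i} \right) \prod_{j} \left( s + d_{j} \right),
\]

the coefficients b_i, c_i, d_j of the quadratic and linear factors are the variables, stability is enforced by the inequalities b_i > 0, c_i > 0, d_j > 0, and the equalities require these coefficients to match the characteristic polynomial produced by the structurally constrained gain vector.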
Abstract:
Unlike zero-sum stochastic games, a difficult problem in general-sum stochastic games is to obtain verifiable conditions for Nash equilibria. We show in this paper that by splitting an associated non-linear optimization problem into several sub-problems, characterization of Nash equilibria in general-sum discounted stochastic games is possible. Using the aforementioned sub-problems, we derive a set of necessary and sufficient verifiable conditions (termed KKT-SP conditions) for a strategy-pair to result in a Nash equilibrium. We also show that any algorithm which tracks the zero of the gradient of the Lagrangian of every sub-problem provides a Nash strategy-pair.
Abstract:
We present the first q-Gaussian smoothed functional (SF) estimator of the Hessian and the first Newton-based stochastic optimization algorithm that estimates both the Hessian and the gradient of the objective function using q-Gaussian perturbations. Our algorithm requires only two system simulations (regardless of the parameter dimension) and estimates both the gradient and the Hessian at each update epoch from these. We also present a proof of convergence of the proposed algorithm. In a related recent work (Ghoshdastidar, Dukkipati, & Bhatnagar, 2014), we presented gradient SF algorithms based on q-Gaussian perturbations. Our work extends prior work on SF algorithms by generalizing the class of perturbation distributions, since most distributions reported in the literature for which SF algorithms are known to work turn out to be special cases of the q-Gaussian distribution. Besides studying the convergence properties of our algorithm analytically, we also show results of numerical simulations on a model of a queuing network that illustrate the significance of the proposed method. In particular, we observe that our algorithm performs better in most cases, over a wide range of q-values, than Newton SF algorithms with Gaussian and Cauchy perturbations, as well as the gradient q-Gaussian SF algorithms.
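A minimal sketch of a two-simulation SF estimator of the gradient and Hessian, shown with Gaussian perturbations (the q → 1 member of the q-Gaussian family); the paper's general q-Gaussian sampler and incremental updates are not reproduced, each sample uses exactly two function evaluations, and the estimator is batch-averaged here for clarity.

    import numpy as np

    rng = np.random.default_rng(2)

    def f(x):
        # hypothetical objective; stands in for a (noisy) system simulation
        return 0.5 * x @ x + np.sin(x).sum()

    def sf_grad_hess(x, beta=0.1, n_samples=2000):
        # averages the classical Gaussian-SF forms of the gradient and
        # Hessian estimates over many perturbations
        d = x.size
        g = np.zeros(d)
        H = np.zeros((d, d))
        for _ in range(n_samples):
            eta = rng.standard_normal(d)                    # perturbation
            fp, fm = f(x + beta * eta), f(x - beta * eta)   # two simulations
            g += eta * (fp - fm) / (2.0 * beta)
            H += (np.outer(eta, eta) - np.eye(d)) * (fp + fm) / (2.0 * beta ** 2)
        return g / n_samples, H / n_samples

    x = np.ones(3)
    g, H = sf_grad_hess(x)
    step = np.linalg.solve(H + 1e-3 * np.eye(3), g)   # damped Newton direction
    print("gradient estimate:", np.round(g, 2))
    print("Newton step:", np.round(step, 2))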
Abstract:
A new 1D NMR experiment, termed 'Quick G-SERF', which reintroduces selective proton-proton scalar interactions in a pure shift spectrum during real-time data acquisition, is reported. The method provides information on multiple proton-proton couplings from a single experiment, analogous to the 2D G-SERF technique, while shortening the experimental time by 1-2 orders of magnitude owing to the reduced dimensionality and enhanced sensitivity.