469 results for Macchina automatica avvolgipallet
Abstract:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates the Q-values of all feasible state–action pairs at each instant, while the second updates the Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A proof of convergence of the algorithms is given. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented in a few different settings.
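As a rough illustration of what "updating the Q-values of all feasible state–action pairs at each instant" means, here is a minimal sketch of the standard tabular Q-learning update applied to every pair in each sweep. It is not the two-timescale scheme of the abstract: the two-state deterministic environment, the reward rule, and the names `synchronous_q_learning`, `next_state`, and `reward` are all assumptions made up for this example.

```python
def synchronous_q_learning(gamma=0.5, step=0.5, sweeps=200):
    """Standard Q-learning update applied to ALL state-action pairs per sweep.

    Hypothetical toy environment (an assumption, not from the abstract):
    two states, action 0 stays put, action 1 moves to the other state,
    and the reward is 1 whenever the next state is state 1.
    """
    states, actions = (0, 1), (0, 1)

    def next_state(s, a):
        return s if a == 0 else 1 - s

    def reward(s, a):
        return 1.0 if next_state(s, a) == 1 else 0.0

    q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(sweeps):
        new_q = {}
        for s in states:
            for a in actions:
                s2 = next_state(s, a)
                # One-step target: reward plus discounted best next Q-value.
                target = reward(s, a) + gamma * max(q[(s2, b)] for b in actions)
                # Move the old estimate a fraction `step` toward the target.
                new_q[(s, a)] = q[(s, a)] + step * (target - q[(s, a)])
        q = new_q
    return q


if __name__ == "__main__":
    for pair, value in sorted(synchronous_q_learning().items()):
        print(pair, round(value, 6))
```

Because the toy transitions are deterministic, a constant step size is enough here; with noisy transitions the step sizes would normally have to decrease over time.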
Abstract:
It is shown that a sufficient condition for the asymptotic stability-in-the-large of an autonomous system containing a linear part with transfer function G(jω) and a non-linearity belonging to a class of power-law non-linearities with slope restriction [0, K] in cascade in a negative feedback loop is Re Z(jω)[G(jω) + 1/K] ≥ 0 for all ω, where the multiplier is given by Z(jω) = 1 + αjω + Y(jω) - Y(-jω) with α real, y(t) = 0 for t < 0 and ∫₀^∞ |y(t)| dt < 1/(2c₂), c₂ being a constant associated with the class of non-linearity. Any allowable multiplier can be converted to the above form, and this form leads to fewer restrictions on the parameters in many cases. Criteria for the case of odd monotonic non-linearities and of linear gains are obtained as limiting cases of the criterion developed. A striking feature of the present result is that in the linear case it reduces to the necessary and sufficient conditions corresponding to the Nyquist criterion. An inequality of the type |R(T) - R(-T)| ≤ 2c₂R(0), where R(T) is the input-output cross-correlation function of the non-linearity, is used in deriving the results.
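The frequency-domain inequality above can be spot-checked numerically. The sketch below evaluates Re Z(jω)[G(jω) + 1/K] on a logarithmic frequency grid for the simplest multiplier Z(jω) = 1 + αjω (the special case y(t) = 0). The function name `criterion_holds`, the grid, and the example transfer functions are assumptions chosen for illustration; a grid check can only suggest that the condition holds, it does not prove it for all ω.

```python
def criterion_holds(G, K, alpha=0.0, omegas=None):
    """Check Re{Z(jw) [G(jw) + 1/K]} >= 0 on a frequency grid.

    G      : callable mapping a real frequency w to the complex value G(jw)
    K      : upper slope bound of the non-linearity
    alpha  : real coefficient of the multiplier Z(jw) = 1 + alpha*j*w
             (special case y(t) = 0 of the multiplier in the abstract)
    omegas : frequencies to test; defaults to 1e-3 .. 1e3 (an assumption)
    """
    if omegas is None:
        omegas = [10 ** (k / 20 - 3) for k in range(121)]
    for w in omegas:
        Z = 1 + alpha * 1j * w
        # For a real-rational G the real part is even in w when y(t) = 0,
        # so testing positive frequencies is enough in this special case.
        if (Z * (G(w) + 1 / K)).real < 0:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical first-order lag G(s) = 1/(s + 1).
    print(criterion_holds(lambda w: 1 / (1j * w + 1), K=1.0))
    # Hypothetical G(s) = -2/(s + 1), which violates the inequality near w = 0.
    print(criterion_holds(lambda w: -2 / (1j * w + 1), K=1.0))
```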
Abstract:
A linear state feedback gain vector used in the control of a single input dynamical system may be constrained because of the way feedback is realized. Some examples of feedback realizations which impose constraints on the gain vector are: static output feedback, constant gain feedback for several operating points of a system, and two-controller feedback. We consider a general class of problems of stabilization of single input dynamical systems with such structural constraints and give a numerical method to solve them. Each of these problems is cast into a problem of solving a system of equalities and inequalities. In this formulation, the coefficients of the quadratic and linear factors of the closed-loop characteristic polynomial are the variables. To solve the system of equalities and inequalities, a continuous realization of the gradient projection method and a barrier method are used under the homotopy framework. Our method is illustrated with an example for each class of control structure constraint.
Abstract:
A strongly connected decentralized control system may be made single channel controllable and observable with respect to any channel by decentralized feedbacks. It is noted here that the system example considered by Corfmat and Morse to illustrate this fact is already single channel controllable and observable, with respect to one of the channels. An alternate example which fits into the situation is presented in this item.
Abstract:
Unlike zero-sum stochastic games, a difficult problem in general-sum stochastic games is to obtain verifiable conditions for Nash equilibria. We show in this paper that by splitting an associated non-linear optimization problem into several sub-problems, characterization of Nash equilibria in general-sum discounted stochastic games is possible. Using the aforementioned sub-problems, we in fact derive a set of necessary and sufficient verifiable conditions (termed KKT-SP conditions) for a strategy-pair to result in a Nash equilibrium. Also, we show that any algorithm which tracks the zero of the gradient of the Lagrangian of every sub-problem provides a Nash strategy-pair. (c) 2012 Elsevier Ltd. All rights reserved.
Abstract:
We present the first q-Gaussian smoothed functional (SF) estimator of the Hessian and the first Newton-based stochastic optimization algorithm that estimates both the Hessian and the gradient of the objective function using q-Gaussian perturbations. Our algorithm requires only two system simulations (regardless of the parameter dimension) and estimates both the gradient and the Hessian at each update epoch using these. We also present a proof of convergence of the proposed algorithm. In a related recent work (Ghoshdastidar, Dukkipati, & Bhatnagar, 2014), we presented gradient SF algorithms based on the q-Gaussian perturbations. Our work extends prior work on SF algorithms by generalizing the class of perturbation distributions, as most distributions reported in the literature for which SF algorithms are known to work turn out to be special cases of the q-Gaussian distribution. Besides studying the convergence properties of our algorithm analytically, we also show the results of numerical simulations on a model of a queuing network that illustrate the significance of the proposed method. In particular, we observe that our algorithm performs better in most cases, over a wide range of q-values, in comparison to Newton SF algorithms with the Gaussian and Cauchy perturbations, as well as the gradient q-Gaussian SF algorithms. (C) 2014 Elsevier Ltd. All rights reserved.
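To make the "two simulations regardless of the parameter dimension" idea concrete, here is a minimal sketch of a two-sided smoothed functional gradient estimate with ordinary Gaussian perturbations, which is one of the special cases the abstract says the q-Gaussian family covers. It estimates only the gradient, not the Hessian, and it is not the authors' algorithm: the function name `sf_gradient`, the smoothing parameter `beta`, the sample count, and the quadratic test objective are all assumptions for illustration.

```python
import random


def sf_gradient(f, theta, beta=0.1, samples=20000, seed=0):
    """Two-sided Gaussian smoothed functional gradient estimate.

    Each sample draws one Gaussian perturbation vector eta and evaluates f
    exactly twice, at theta + beta*eta and theta - beta*eta, whatever the
    dimension of theta. The average of eta * (f_plus - f_minus) / (2*beta)
    estimates the gradient of a smoothed version of f.
    """
    rng = random.Random(seed)
    n = len(theta)
    grad = [0.0] * n
    for _ in range(samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
        f_plus = f([t + beta * e for t, e in zip(theta, eta)])
        f_minus = f([t - beta * e for t, e in zip(theta, eta)])
        scale = (f_plus - f_minus) / (2.0 * beta)
        for i in range(n):
            grad[i] += eta[i] * scale
    return [g / samples for g in grad]


if __name__ == "__main__":
    # Hypothetical objective f(x) = x1^2 + x2^2, whose true gradient is 2x.
    estimate = sf_gradient(lambda x: sum(v * v for v in x), [1.0, -2.0])
    print(estimate)
```

In a stochastic optimization loop one would typically use a single perturbation per update and let the iterate averaging do the smoothing; the large sample count here is only to make the estimate easy to compare with the true gradient.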
Abstract:
A new 1D NMR experiment cited as `Quick G-SERF', which re-introduces selective proton-proton scalar interactions in a pure shift spectrum during real time data acquisition, is reported. The method provides information on multiple proton-proton couplings from a single experiment, analogous to the 2D G-SERF technique, while significantly shortening the experimental time by 1-2 orders of magnitude due to reduced dimension and enhanced sensitivity.
Abstract:
The sensor scheduling problem can be formulated as a controlled hidden Markov model and this paper solves the problem when the state, observation and action spaces are continuous. This general case is important as it is the natural framework for many applications. The aim is to minimise the variance of the estimation error of the hidden state w.r.t. the action sequence. We present a novel simulation-based method that uses a stochastic gradient algorithm to find optimal actions. © 2007 Elsevier Ltd. All rights reserved.
Abstract:
Educational level: Undergraduate. Duration (in hours): More than 50 hours