281 resultados para Nature inspired algorithms
Resumo:
We present four new reinforcement learning algorithms based on actor-critic, natural-gradient and functi approximation ideas,and we provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function-approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of special interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients. Our results extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
Resumo:
This article analyzes the effect of devising a new failure envelope by the combination of the most commonly used failure criteria for the composite laminates, on the design of composite structures. The failure criteria considered for the study are maximum stress and Tsai-Wu criteria. In addition to these popular phenomenological-based failure criteria, a micromechanics-based failure criterion called failure mechanism-based failure criterion is also considered. The failure envelopes obtained by these failure criteria are superimposed over one another and a new failure envelope is constructed based on the lowest absolute values of the strengths predicted by these failure criteria. Thus, the new failure envelope so obtained is named as most conservative failure envelope. A minimum weight design of composite laminates is performed using genetic algorithms. In addition to this, the effect of stacking sequence on the minimum weight of the laminate is also studied. Results are compared for the different failure envelopes and the conservative design is evaluated, with respect to the designs obtained by using only one failure criteria. The design approach is recommended for structures where composites are the key load-carrying members such as helicopter rotor blades.
Resumo:
A considerable amount of work has been dedicated on the development of analytical solutions for flow of chemical contaminants through soils. Most of the analytical solutions for complex transport problems are closed-form series solutions. The convergence of these solutions depends on the eigen values obtained from a corresponding transcendental equation. Thus, the difficulty in obtaining exact solutions from analytical models encourages the use of numerical solutions for the parameter estimation even though, the later models are computationally expensive. In this paper a combination of two swarm intelligence based algorithms are used for accurate estimation of design transport parameters from the closed-form analytical solutions. Estimation of eigen values from a transcendental equation is treated as a multimodal discontinuous function optimization problem. The eigen values are estimated using an algorithm derived based on glowworm swarm strategy. Parameter estimation of the inverse problem is handled using standard PSO algorithm. Integration of these two algorithms enables an accurate estimation of design parameters using closed-form analytical solutions. The present solver is applied to a real world inverse problem in environmental engineering. The inverse model based on swarm intelligence techniques is validated and the accuracy in parameter estimation is shown. The proposed solver quickly estimates the design parameters with a great precision.
Resumo:
Biological systems present remarkable adaptation, reliability, and robustness in various environments, even under hostility. Most of them are controlled by the individuals in a distributed and self-organized way. These biological mechanisms provide useful resources for designing the dynamical and adaptive routing schemes of wireless mobile sensor networks, in which the individual nodes should ideally operate without central control. This paper investigates crucial biologically inspired mechanisms and the associated techniques for resolving routing in wireless sensor networks, including Ant-based and genetic approaches. Furthermore, the principal contributions of this paper are as follows. We present a mathematical theory of the biological computations in the context of sensor networks; we further present a generalized routing framework in sensor networks by diffusing different modes of biological computations using Ant-based and genetic approaches; finally, an overview of several emerging research directions are addressed within the new biologically computational framework.
Resumo:
A common trick for designing faster quantum adiabatic algorithms is to apply the adiabaticity condition locally at every instant. However it is often difficult to determine the instantaneous gap between the lowest two eigenvalues, which is an essential ingredient in the adiabaticity condition. In this paper we present a simple linear algebraic technique for obtaining a lower bound on the instantaneous gap even in such a situation. As an illustration, we investigate the adiabatic un-ordered search of van Dam et al. [17] and Roland and Cerf [15] when the non-zero entries of the diagonal final Hamiltonian are perturbed by a polynomial (in log N, where N is the length of the unordered list) amount. We use our technique to derive a bound on the running time of a local adiabatic schedule in terms of the minimum gap between the lowest two eigenvalues.
Resumo:
We consider the problem of determining if two finite groups are isomorphic. The groups are assumed to be represented by their multiplication tables. We present an O(n) algorithm that determines if two Abelian groups with n elements each are isomorphic. This improves upon the previous upper bound of O(n log n) [Narayan Vikas, An O(n) algorithm for Abelian p-group isomorphism and an O(n log n) algorithm for Abelian group isomorphism, J. Comput. System Sci. 53 (1996) 1-9] known for this problem. We solve a more general problem of computing the orders of all the elements of any group (not necessarily Abelian) of size n in O(n) time. Our algorithm for isomorphism testing of Abelian groups follows from this result. We use the property that our order finding algorithm works for any group to design a simple O(n) algorithm for testing whether a group of size n, described by its multiplication table, is nilpotent. We also give an O(n) algorithm for determining if a group of size n, described by its multiplication table, is Abelian. (C) 2007 Elsevier Inc. All rights reserved.
Resumo:
We propose certain discrete parameter variants of well known simulation optimization algorithms. Two of these algorithms are based on the smoothed functional (SF) technique while two others are based on the simultaneous perturbation stochastic approximation (SPSA) method. They differ from each other in the way perturbations are obtained and also the manner in which projections and parameter updates are performed. All our algorithms use two simulations and two-timescale stochastic approximation. As an application setting, we consider the important problem of admission control of packets in communication networks under dependent service times. We consider a discrete time slotted queueing model of the system and consider two different scenarios - one where the service times have a dependence on the system state and the other where they depend on the number of arrivals in a time slot. Under our settings, the simulated objective function appears ill-behaved with multiple local minima and a unique global minimum characterized by a sharp dip in the objective function in a small region of the parameter space. We compare the performance of our algorithms on these settings and observe that the two SF algorithms show the best results overall. In fact, in many cases studied, SF algorithms converge to the global minimum.
Resumo:
Four hybrid algorithms has been developed for the solution of the unit commitment problem. They use simulated annealing as one of the constituent techniques, and produce lower cost schedules; two of them have less overhead than other soft computing techniques. They are also more robust to the choice of parameters. A special technique avoids the generating of infeasible schedules, and thus reduces computation time.
Resumo:
We investigate the dynamics of peeling of an adhesive tape subjected to a constant pull speed. Due to the constraint between the pull force, peel angle and the peel force, the equations of motion derived earlier fall into the category of differential-algebraic equations (DAE) requiring an appropriate algorithm for its numerical solution. By including the kinetic energy arising from the stretched part of the tape in the Lagrangian, we derive equations of motion that support stick-slip jumps as a natural consequence of the inherent dynamics itself, thus circumventing the need to use any special algorithm. In the low mass limit, these equations reproduce solutions obtained using a differential-algebraic algorithm introduced for the earlier singular equations. We find that mass has a strong influence on the dynamics of the model rendering periodic solutions to chaotic and vice versa. Apart from the rich dynamics, the model reproduces several qualitative features of the different waveforms of the peel force function as also the decreasing nature of force drop magnitudes.
Resumo:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates Q-values of all feasible state–action pairs at each instant while the second updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A proof of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented on a few different settings.
Resumo:
Next generation wireless systems employ Orthogonal frequency division multiplexing (OFDM) physical layer owing to the high data rate transmissions that are possible without increase in bandwidth. While TCP performance has been extensively studied for interaction with link layer ARQ, little attention has been given to the interaction of TCP with MAC layer. In this work, we explore cross-layer interactions in an OFDM based wireless system, specifically focusing on channel-aware resource allocation strategies at the MAC layer and its impact on TCP congestion control. Both efficiency and fairness oriented MAC resource allocation strategies were designed for evaluating the performance of TCP. The former schemes try to exploit the channel diversity to maximize the system throughput, while the latter schemes try to provide a fair resource allocation over sufficiently long time duration. From a TCP goodput standpoint, we show that the class of MAC algorithms that incorporate a fairness metric and consider the backlog outperform the channel diversity exploiting schemes.
Resumo:
We consider the problem of computing an approximate minimum cycle basis of an undirected edge-weighted graph G with m edges and n vertices; the extension to directed graphs is also discussed. In this problem, a {0,1} incidence vector is associated with each cycle and the vector space over F-2 generated by these vectors is the cycle space of G. A set of cycles is called a cycle basis of G if it forms a basis for its cycle space. A cycle basis where the sum of the weights of the cycles is minimum is called a minimum cycle basis of G. Cycle bases of low weight are useful in a number of contexts, e.g. the analysis of electrical networks, structural engineering, chemistry, and surface reconstruction. We present two new algorithms to compute an approximate minimum cycle basis. For any integer k >= 1, we give (2k - 1)-approximation algorithms with expected running time 0(kmn(1+2/k) + mn((1+1/k)(omega-1))) and deterministic running time 0(n(3+2/k)), respectively. Here omega is the best exponent of matrix multiplication. It is presently known that omega < 2.376. Both algorithms are o(m(omega)) for dense graphs. This is the first time that any algorithm which computes sparse cycle bases with a guarantee drops below the Theta(m(omega)) bound. We also present a 2-approximation algorithm with O(m(omega) root n log n) expected running time, a linear time 2-approximation algorithm for planar graphs and an O(n(3)) time 2.42-approximation algorithm for the complete Euclidean graph in the plane.
Resumo:
Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show w.p 1 convergence to a set with the necessary condition for constrained optima. The proposed parameterization is for FHMDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as critic to the proposed actor recursion. Simulation results from flow-control in communication networks attest to the performance advantages of all three algorithms.
Resumo:
One influential image that is popular among scientists is the view that mathematics is the language of nature. The present article discusses another possible way to approach the relation between mathematics and nature, which is by using the idea of information and the conceptual vocabulary of cryptography. This approach allows us to understand the possibility that secrets of nature need not be written in mathematics and yet mathematics is necessary as a cryptographic key to unlock these secrets. Various advantages of such a view are described in this article.
Resumo:
Relay selection for cooperative communications promises significant performance improvements, and is, therefore, attracting considerable attention. While several criteria have been proposed for selecting one or more relays, distributed mechanisms that perform the selection have received relatively less attention. In this paper, we develop a novel, yet simple, asymptotic analysis of a splitting-based multiple access selection algorithm to find the single best relay. The analysis leads to simpler and alternate expressions for the average number of slots required to find the best user. By introducing a new contention load' parameter, the analysis shows that the parameter settings used in the existing literature can be improved upon. New and simple bounds are also derived. Furthermore, we propose a new algorithm that addresses the general problem of selecting the best Q >= 1 relays, and analyze and optimize it. Even for a large number of relays, the scalable algorithm selects the best two relays within 4.406 slots and the best three within 6.491 slots, on average. We also propose a new and simple scheme for the practically relevant case of discrete metrics. Altogether, our results develop a unifying perspective about the general problem of distributed selection in cooperative systems and several other multi-node systems.