168 results for Probabilistic Algorithms
Abstract:
We present four new reinforcement learning algorithms based on actor-critic, natural-gradient and function-approximation ideas, and we provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function-approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of special interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better-conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients. Our results extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
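As a rough illustration of the actor-critic structure described above, the following minimal sketch runs a critic (a scalar value estimate updated by the temporal difference error) alongside an actor (softmax policy preferences updated by a stochastic gradient step). It is not one of the four algorithms of the paper: it uses the ordinary rather than the natural gradient, no function approximation, and a hypothetical single-state problem in which each episode ends after one action. The function names, step sizes and reward values are assumptions made for the example.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def train_actor_critic(rewards, steps=5000, alpha=0.1, beta=0.02, seed=0):
    """Toy actor-critic on a single-state, one-step problem (assumed setup).

    alpha is the critic step size and beta the smaller actor step size,
    mimicking the faster-critic / slower-actor split of two-timescale schemes.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # actor: one preference per action
    v = 0.0                       # critic: value estimate of the single state
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choices(range(len(rewards)), weights=pi)[0]
        # TD error; the episode ends after one step, so there is no next-state term.
        delta = rewards[a] - v
        v += alpha * delta
        # Gradient of log pi(a) with respect to each preference is 1[b == a] - pi[b].
        for b in range(len(theta)):
            grad_log = (1.0 if b == a else 0.0) - pi[b]
            theta[b] += beta * delta * grad_log
    return softmax(theta), v

policy, value = train_actor_critic([0.0, 1.0])
print(policy, value)
```

The critic acts as a baseline here: the actor moves in the direction of the TD error rather than the raw reward, which is the variance-reduction idea the abstract refers to.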
Abstract:
This article analyzes how a new failure envelope, devised by combining the most commonly used failure criteria for composite laminates, affects the design of composite structures. The failure criteria considered for the study are the maximum stress and Tsai-Wu criteria. In addition to these popular phenomenological failure criteria, a micromechanics-based criterion called the failure mechanism-based failure criterion is also considered. The failure envelopes obtained from these criteria are superimposed on one another, and a new failure envelope is constructed from the lowest absolute values of the strengths they predict. The envelope so obtained is named the most conservative failure envelope. A minimum weight design of composite laminates is performed using genetic algorithms. In addition, the effect of stacking sequence on the minimum weight of the laminate is studied. Results are compared for the different failure envelopes, and the conservative design is evaluated with respect to the designs obtained by using only one failure criterion. The design approach is recommended for structures where composites are the key load-carrying members, such as helicopter rotor blades.
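The idea of superimposing envelopes can be sketched for two of the three criteria. The example below evaluates the maximum stress and Tsai-Wu criteria for a single ply under plane stress and treats a stress state as safe only if it lies inside both envelopes. This is an illustrative reading of the "most conservative" envelope, not the authors' implementation: the failure mechanism-based criterion is omitted, the interaction term `F12 = -0.5 * sqrt(F11 * F22)` is a common assumption rather than something stated in the abstract, and the strength values are hypothetical.

```python
import math
from typing import NamedTuple

class Strengths(NamedTuple):
    """Ply strengths as positive magnitudes (hypothetical values, e.g. in MPa)."""
    Xt: float  # longitudinal tension
    Xc: float  # longitudinal compression
    Yt: float  # transverse tension
    Yc: float  # transverse compression
    S: float   # in-plane shear

def max_stress_index(s1, s2, t12, st):
    """Maximum stress criterion: failure is predicted when the index reaches 1."""
    i1 = s1 / st.Xt if s1 >= 0 else -s1 / st.Xc
    i2 = s2 / st.Yt if s2 >= 0 else -s2 / st.Yc
    i12 = abs(t12) / st.S
    return max(i1, i2, i12)

def tsai_wu_index(s1, s2, t12, st):
    """Tsai-Wu criterion in plane stress: failure is predicted when the index reaches 1."""
    F1 = 1.0 / st.Xt - 1.0 / st.Xc
    F2 = 1.0 / st.Yt - 1.0 / st.Yc
    F11 = 1.0 / (st.Xt * st.Xc)
    F22 = 1.0 / (st.Yt * st.Yc)
    F66 = 1.0 / st.S ** 2
    F12 = -0.5 * math.sqrt(F11 * F22)  # assumed interaction coefficient
    return (F1 * s1 + F2 * s2 + F11 * s1 ** 2 + F22 * s2 ** 2
            + F66 * t12 ** 2 + 2.0 * F12 * s1 * s2)

def is_safe_conservative(s1, s2, t12, st):
    """Safe only if the stress state lies inside both failure envelopes."""
    return max_stress_index(s1, s2, t12, st) < 1.0 and tsai_wu_index(s1, s2, t12, st) < 1.0

ply = Strengths(Xt=1500.0, Xc=1200.0, Yt=50.0, Yc=250.0, S=70.0)
print(is_safe_conservative(400.0, 10.0, 20.0, ply))
```

In a weight-minimization loop such as the genetic algorithm mentioned above, a check of this kind would serve as the feasibility constraint for each candidate laminate.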
Abstract:
A considerable amount of work has been dedicated to the development of analytical solutions for the flow of chemical contaminants through soils. Most of the analytical solutions for complex transport problems are closed-form series solutions. The convergence of these solutions depends on the eigenvalues obtained from a corresponding transcendental equation. Thus, the difficulty in obtaining exact solutions from analytical models encourages the use of numerical solutions for parameter estimation, even though the latter models are computationally expensive. In this paper, a combination of two swarm-intelligence-based algorithms is used for accurate estimation of design transport parameters from the closed-form analytical solutions. Estimation of eigenvalues from a transcendental equation is treated as a multimodal discontinuous function optimization problem. The eigenvalues are estimated using an algorithm derived from the glowworm swarm strategy. Parameter estimation for the inverse problem is handled using the standard PSO algorithm. Integration of these two algorithms enables accurate estimation of design parameters using closed-form analytical solutions. The present solver is applied to a real-world inverse problem in environmental engineering. The inverse model based on swarm intelligence techniques is validated, and the accuracy of the parameter estimation is shown. The proposed solver estimates the design parameters quickly and with great precision.
Abstract:
A common trick for designing faster quantum adiabatic algorithms is to apply the adiabaticity condition locally at every instant. However, it is often difficult to determine the instantaneous gap between the lowest two eigenvalues, which is an essential ingredient in the adiabaticity condition. In this paper we present a simple linear algebraic technique for obtaining a lower bound on the instantaneous gap even in such a situation. As an illustration, we investigate the adiabatic unordered search of van Dam et al. [17] and Roland and Cerf [15] when the non-zero entries of the diagonal final Hamiltonian are perturbed by a polynomial (in log N, where N is the length of the unordered list) amount. We use our technique to derive a bound on the running time of a local adiabatic schedule in terms of the minimum gap between the lowest two eigenvalues.
Abstract:
We consider the problem of determining if two finite groups are isomorphic. The groups are assumed to be represented by their multiplication tables. We present an O(n) algorithm that determines if two Abelian groups with n elements each are isomorphic. This improves upon the previous upper bound of O(n log n) [Narayan Vikas, An O(n) algorithm for Abelian p-group isomorphism and an O(n log n) algorithm for Abelian group isomorphism, J. Comput. System Sci. 53 (1996) 1-9] known for this problem. We solve a more general problem of computing the orders of all the elements of any group (not necessarily Abelian) of size n in O(n) time. Our algorithm for isomorphism testing of Abelian groups follows from this result. We use the property that our order finding algorithm works for any group to design a simple O(n) algorithm for testing whether a group of size n, described by its multiplication table, is nilpotent. We also give an O(n) algorithm for determining if a group of size n, described by its multiplication table, is Abelian. (C) 2007 Elsevier Inc. All rights reserved.
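The link between element orders and Abelian group isomorphism can be made concrete with a small sketch. Two finite Abelian groups are isomorphic exactly when they have the same number of elements of each order, so comparing the multisets of element orders settles the question. The code below is a naive illustration of that reduction, not the O(n) algorithm of the paper: it computes each order by repeated multiplication and checks commutativity entry by entry. Tables are assumed to be lists of lists over the labels 0..n-1, and all function names are hypothetical.

```python
from collections import Counter

def find_identity(table):
    """Return the label e with e*x == x*e == x for every x."""
    n = len(table)
    for e in range(n):
        if all(table[e][x] == x and table[x][e] == x for x in range(n)):
            return e
    raise ValueError("no identity element found")

def element_orders(table):
    """Order of every element, by repeated multiplication (naive, not linear time)."""
    e = find_identity(table)
    orders = []
    for g in range(len(table)):
        k, power = 1, g
        while power != e:
            power = table[power][g]
            k += 1
        orders.append(k)
    return orders

def is_abelian(table):
    """Check a*b == b*a for all pairs (a direct scan of the whole table)."""
    n = len(table)
    return all(table[a][b] == table[b][a] for a in range(n) for b in range(a + 1, n))

def abelian_groups_isomorphic(t1, t2):
    """Compare two Abelian groups via their multisets of element orders."""
    if len(t1) != len(t2) or not (is_abelian(t1) and is_abelian(t2)):
        raise ValueError("expected two Abelian groups of the same size")
    return Counter(element_orders(t1)) == Counter(element_orders(t2))

cyclic4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]  # Z_4
klein4 = [[i ^ j for j in range(4)] for i in range(4)]         # Z_2 x Z_2
print(element_orders(cyclic4), element_orders(klein4))
```

Both example groups have four elements, but the cyclic group has elements of order 4 and the Klein four-group does not, so the comparison reports them as non-isomorphic.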
Abstract:
We propose certain discrete parameter variants of well-known simulation optimization algorithms. Two of these algorithms are based on the smoothed functional (SF) technique while two others are based on the simultaneous perturbation stochastic approximation (SPSA) method. They differ from each other in the way perturbations are obtained and in the manner in which projections and parameter updates are performed. All our algorithms use two simulations and two-timescale stochastic approximation. As an application setting, we consider the important problem of admission control of packets in communication networks under dependent service times. We consider a discrete time slotted queueing model of the system and consider two different scenarios - one where the service times depend on the system state and the other where they depend on the number of arrivals in a time slot. Under our settings, the simulated objective function appears ill-behaved, with multiple local minima and a unique global minimum characterized by a sharp dip in the objective function in a small region of the parameter space. We compare the performance of our algorithms on these settings and observe that the two SF algorithms show the best results overall. In fact, in many of the cases studied, the SF algorithms converge to the global minimum.
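For readers unfamiliar with SPSA, the following sketch shows the standard continuous-parameter form of the method, in which a gradient estimate is built from just two evaluations of the objective per iteration regardless of dimension. It is not one of the discrete-parameter variants proposed in the paper and includes neither projection nor a two-timescale structure. The gain constants, the decay exponents and the test objective are assumptions chosen for the example.

```python
import random

def spsa_minimize(f, theta, iters=2000, a=0.2, c=0.1, seed=0):
    """Minimize f with two function evaluations per iteration (basic SPSA sketch)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iters):
        a_k = a / (k + 1) ** 0.602  # step-size gain, slowly decaying
        c_k = c / (k + 1) ** 0.101  # perturbation size, slowly decaying
        # Perturb all coordinates at once with independent +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = f(plus) - f(minus)  # the only two evaluations this iteration
        # Gradient estimate for coordinate i is diff / (2 * c_k * delta[i]).
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta

def quadratic(x):
    """Hypothetical noiseless objective with its minimum at (1, 2)."""
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

print(spsa_minimize(quadratic, [3.0, -2.0]))
```

In a simulation optimization setting, each call to `f` would be replaced by one simulation run, which is why keeping the count at two per iteration matters.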
Abstract:
Four hybrid algorithms have been developed for the solution of the unit commitment problem. They use simulated annealing as one of the constituent techniques and produce lower-cost schedules; two of them have less overhead than other soft computing techniques. They are also more robust to the choice of parameters. A special technique avoids generating infeasible schedules and thus reduces computation time.
Abstract:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates Q-values of all feasible state–action pairs at each instant, while the second updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A proof of convergence of the algorithms is given. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented for a few different settings.
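As background for the Q-value updates mentioned above, here is a minimal sketch of the classical tabular Q-learning step for a reward-maximizing agent. It is the textbook single-timescale update, not either of the two-timescale algorithms proposed in the paper. The table layout (a list of per-state lists), the function name and the default step size and discount factor are assumptions.

```python
def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q[s][a]."""
    # Bootstrapped target: immediate reward plus discounted best next-state value.
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Two states, two actions; values are arbitrary starting estimates.
Q = [[0.0, 0.0], [0.0, 2.0]]
print(q_learning_step(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9))
```

This step touches a single state-action pair per call; the first algorithm in the abstract instead updates all feasible pairs at each instant.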
Abstract:
Next-generation wireless systems employ an orthogonal frequency division multiplexing (OFDM) physical layer owing to the high data rate transmissions that are possible without an increase in bandwidth. While TCP performance has been extensively studied for interaction with link layer ARQ, little attention has been given to the interaction of TCP with the MAC layer. In this work, we explore cross-layer interactions in an OFDM-based wireless system, specifically focusing on channel-aware resource allocation strategies at the MAC layer and their impact on TCP congestion control. Both efficiency-oriented and fairness-oriented MAC resource allocation strategies were designed for evaluating the performance of TCP. The former schemes try to exploit the channel diversity to maximize the system throughput, while the latter schemes try to provide a fair resource allocation over a sufficiently long time duration. From a TCP goodput standpoint, we show that the class of MAC algorithms that incorporate a fairness metric and consider the backlog outperforms the schemes that exploit channel diversity.
Abstract:
We consider the problem of computing an approximate minimum cycle basis of an undirected edge-weighted graph $G$ with $m$ edges and $n$ vertices; the extension to directed graphs is also discussed. In this problem, a $\{0,1\}$ incidence vector is associated with each cycle and the vector space over $\mathbb{F}_2$ generated by these vectors is the cycle space of $G$. A set of cycles is called a cycle basis of $G$ if it forms a basis for its cycle space. A cycle basis where the sum of the weights of the cycles is minimum is called a minimum cycle basis of $G$. Cycle bases of low weight are useful in a number of contexts, e.g. the analysis of electrical networks, structural engineering, chemistry, and surface reconstruction. We present two new algorithms to compute an approximate minimum cycle basis. For any integer $k \ge 1$, we give $(2k-1)$-approximation algorithms with expected running time $O(k m n^{1+2/k} + m n^{(1+1/k)(\omega-1)})$ and deterministic running time $O(n^{3+2/k})$, respectively. Here $\omega$ is the best exponent of matrix multiplication. It is presently known that $\omega < 2.376$. Both algorithms are $o(m^{\omega})$ for dense graphs. This is the first time that any algorithm which computes sparse cycle bases with a guarantee drops below the $\Theta(m^{\omega})$ bound. We also present a 2-approximation algorithm with $O(m^{\omega} \sqrt{n \log n})$ expected running time, a linear time 2-approximation algorithm for planar graphs and an $O(n^3)$ time 2.42-approximation algorithm for the complete Euclidean graph in the plane.
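The definition of the cycle space can be illustrated on a tiny graph. In the sketch below each cycle is stored as a bitmask over the edges (its {0,1} incidence vector), addition over F_2 is bitwise XOR, and the dimension of the span is found by Gaussian elimination. This only demonstrates the definitions; it does not compute a minimum or approximate minimum cycle basis, and the example graph, edge numbering and function names are assumptions.

```python
def incidence_vector(edge_indices):
    """Encode a set of edge indices as a bitmask, i.e. a vector over F_2."""
    vec = 0
    for i in edge_indices:
        vec |= 1 << i
    return vec

def gf2_rank(vectors):
    """Rank over F_2 of bitmask vectors, by elimination on the leading bit."""
    pivots = {}  # leading bit position -> stored vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # clears the leading bit, so the loop terminates
    return len(pivots)

# Square 0-1-2-3 with diagonal 0-2.
# Edges: e0=(0,1), e1=(1,2), e2=(2,3), e3=(3,0), e4=(0,2).
triangle_a = incidence_vector([0, 1, 4])  # cycle 0-1-2-0
triangle_b = incidence_vector([2, 3, 4])  # cycle 0-2-3-0
square = incidence_vector([0, 1, 2, 3])   # cycle 0-1-2-3-0

print(triangle_a ^ triangle_b == square)         # the two triangles sum to the square
print(gf2_rank([triangle_a, triangle_b, square]))
```

For this connected graph the cycle space has dimension m - n + 1 = 5 - 4 + 1 = 2, so any two of the three cycles form a cycle basis and the third is their sum.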
Abstract:
Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show convergence with probability 1 to a set satisfying the necessary condition for constrained optima. The proposed parameterization is for FH-MDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as a critic for the proposed actor recursion. Simulation results from flow control in communication networks attest to the performance advantages of all three algorithms.
Abstract:
Relay selection for cooperative communications promises significant performance improvements, and is, therefore, attracting considerable attention. While several criteria have been proposed for selecting one or more relays, distributed mechanisms that perform the selection have received relatively less attention. In this paper, we develop a novel, yet simple, asymptotic analysis of a splitting-based multiple access selection algorithm to find the single best relay. The analysis leads to simpler, alternative expressions for the average number of slots required to find the best user. By introducing a new 'contention load' parameter, the analysis shows that the parameter settings used in the existing literature can be improved upon. New and simple bounds are also derived. Furthermore, we propose a new algorithm that addresses the general problem of selecting the best Q >= 1 relays, and analyze and optimize it. Even for a large number of relays, the scalable algorithm selects the best two relays within 4.406 slots and the best three within 6.491 slots, on average. We also propose a new and simple scheme for the practically relevant case of discrete metrics. Altogether, our results develop a unifying perspective on the general problem of distributed selection in cooperative systems and several other multi-node systems.
Abstract:
In this paper, we propose a Self-Adaptive Migration Model for Genetic Algorithms, in which the population size, the number of crossover points and the mutation rate of each population are set adaptively. Further, the migration of individuals between populations is decided dynamically. The paper gives a mathematical schema analysis of the method, showing that the algorithm exploits previously discovered knowledge for a more focused and concentrated search of heuristically high-yielding regions while simultaneously performing a highly explorative search of the other regions of the search space. The effective performance of the algorithm is then shown on standard testbed functions in comparison with the Island model GA (IGA) and the Simple GA (SGA).
Abstract:
A laminated composite plate model based on first order shear deformation theory is implemented using the finite element method. Matrix cracks are introduced into the finite element model by considering changes in the A, B and D matrices of composites. The effects of different boundary conditions, laminate types and ply angles on the behavior of composite plates with matrix cracks are studied. Finally, the effect on the composite plate of material property uncertainty, which is important for composite materials, is investigated using Monte Carlo simulations. Probabilistic estimates of damage detection reliability in composite plates are made for static and dynamic measurements. It is found that the effect of uncertainty must be considered for accurate damage detection in composite structures. The estimates of variance obtained for observable system properties due to uncertainty can be used for developing more robust damage detection algorithms. (C) 2010 Elsevier Ltd. All rights reserved.