328 results for Stochastic Approximation Algorithms
Abstract:
The random early detection (RED) technique has seen a lot of research over the years. However, the functional relationship between RED performance and its parameters, viz., queue weight (ω_q), marking probability (max_p), minimum threshold (min_th) and maximum threshold (max_th), is not analytically available. In this paper, we formulate a probabilistic constrained optimization problem by assuming a nonlinear relationship between the RED average queue length and its parameters. This problem involves all the RED parameters as the variables of the optimization problem. We use the barrier and the penalty function approaches for its solution. However (as above), the exact functional relationship between the barrier and penalty objective functions and the optimization variable is not known, but noisy samples of these are available for different parameter values. Thus, for obtaining the gradient and Hessian of the objective, we use certain recently developed simultaneous perturbation stochastic approximation (SPSA) based estimates of these. We propose two four-timescale stochastic approximation algorithms based on certain modified second-order SPSA updates for finding the optimum RED parameters. We present the results of detailed simulation experiments conducted over different network topologies and network/traffic conditions/settings, comparing the performance of our algorithms with variants of RED and a few other well-known adaptive queue management (AQM) techniques discussed in the literature.
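For readers unfamiliar with how the four RED parameters interact, the following is a minimal sketch of the basic RED marking rule: an exponentially weighted average of the queue length, and a marking probability that rises linearly between the two thresholds. It is not taken from the paper; the function names are hypothetical, and the count-based correction used in the original RED scheme is omitted for brevity.

```python
def update_average(avg, queue_len, w_q):
    """Exponentially weighted moving average of the queue length.

    w_q is the queue weight; a small w_q makes the average react slowly.
    """
    return (1.0 - w_q) * avg + w_q * queue_len


def marking_probability(avg, min_th, max_th, max_p):
    """Basic RED marking probability as a function of the average queue.

    Below min_th nothing is marked, at or above max_th everything is,
    and in between the probability grows linearly from 0 up to max_p.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


if __name__ == "__main__":
    avg = 0.0
    # Illustrative values only: a constant queue of 12 packets.
    for _ in range(2000):
        avg = update_average(avg, 12, w_q=0.002)
    print(avg, marking_probability(avg, min_th=5, max_th=15, max_p=0.1))
```

Because performance depends on all four values jointly, and only through simulation, tuning them is naturally posed as an optimization over noisy measurements.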
Abstract:
The problem of estimating the time-dependent statistical characteristics of a random dynamical system is studied under two different settings. In the first, the system dynamics is governed by a differential equation parameterized by a random parameter, while in the second, this is governed by a differential equation with an underlying parameter sequence characterized by a continuous time Markov chain. We propose, for the first time in the literature, stochastic approximation algorithms for estimating various time-dependent process characteristics of the system. In particular, we provide efficient estimators for quantities such as the mean, variance and distribution of the process at any given time as well as the joint distribution and the autocorrelation coefficient at different times. A novel aspect of our approach is that we assume that information on the parameter model (i.e., its distribution in the first case and transition probabilities of the Markov chain in the second) is not available in either case. This is unlike most other work in the literature that assumes availability of such information. Also, most of the prior work in the literature is geared towards analyzing the steady-state system behavior of the random dynamical system while our focus is on analyzing the time-dependent statistical characteristics which are in general difficult to obtain. We prove the almost sure convergence of our stochastic approximation scheme in each case to the true value of the quantity being estimated. We provide a general class of strongly consistent estimators for the aforementioned statistical quantities with regular sample average estimators being a specific instance of these. We also present an application of the proposed scheme on a widely used model in population biology. Numerical experiments in this framework show that the time-dependent process characteristics as obtained using our algorithm in each case exhibit excellent agreement with exact results. 
(C) 2010 Elsevier Inc. All rights reserved.
Abstract:
We present two efficient discrete parameter simulation optimization (DPSO) algorithms for the long-run average cost objective. One of these algorithms uses the smoothed functional approximation (SFA) procedure, while the other is based on simultaneous perturbation stochastic approximation (SPSA). The use of SFA for DPSO had not been proposed previously in the literature. Further, both algorithms adopt an interesting technique of random projections that we present here for the first time. We give a proof of convergence of our algorithms. Next, we present detailed numerical experiments on a problem of admission control with dependent service times. We consider two different settings involving parameter sets that have moderate and large sizes, respectively. On the first setting, we also show performance comparisons with the well-studied optimal computing budget allocation (OCBA) algorithm and also the equal allocation algorithm. Note to Practitioners: Even though SPSA and SFA have been devised in the literature for continuous optimization problems, our results indicate that they can be powerful techniques even when they are adapted to discrete optimization settings. OCBA is widely recognized as one of the most powerful methods for discrete optimization when the parameter sets are of small or moderate size. On a setting involving a parameter set of size 100, we observe that when the computing budget is small, both SPSA and OCBA show similar performance and are better than SFA; however, as the computing budget is increased, SPSA and SFA show better performance than OCBA. Both our algorithms also show good performance when the parameter set has a size of 10^8. SFA is seen to show the best overall performance. Unlike most other DPSO algorithms in the literature, an advantage of our algorithms is that they are easily implementable regardless of the size of the parameter sets and show good performance in both scenarios.
Abstract:
We propose certain discrete parameter variants of well known simulation optimization algorithms. Two of these algorithms are based on the smoothed functional (SF) technique while two others are based on the simultaneous perturbation stochastic approximation (SPSA) method. They differ from each other in the way perturbations are obtained and also the manner in which projections and parameter updates are performed. All our algorithms use two simulations and two-timescale stochastic approximation. As an application setting, we consider the important problem of admission control of packets in communication networks under dependent service times. We consider a discrete time slotted queueing model of the system and consider two different scenarios - one where the service times have a dependence on the system state and the other where they depend on the number of arrivals in a time slot. Under our settings, the simulated objective function appears ill-behaved with multiple local minima and a unique global minimum characterized by a sharp dip in the objective function in a small region of the parameter space. We compare the performance of our algorithms on these settings and observe that the two SF algorithms show the best results overall. In fact, in many cases studied, SF algorithms converge to the global minimum.
Abstract:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates Q-values of all feasible state–action pairs at each instant while the second updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A proof of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented on a few different settings.
Abstract:
Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show w.p. 1 convergence to a set with the necessary condition for constrained optima. The proposed parameterization is for FH-MDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as critic to the proposed actor recursion. Simulation results from flow-control in communication networks attest to the performance advantages of all three algorithms.
Abstract:
Four algorithms, all variants of Simultaneous Perturbation Stochastic Approximation (SPSA), are proposed. The original one-measurement SPSA uses an estimate of the gradient of objective function L containing an additional bias term not seen in two-measurement SPSA. As a result, the asymptotic covariance matrix of the iterate convergence process has a bias term. We propose a one-measurement algorithm that eliminates this bias, and has asymptotic convergence properties making for easier comparison with the two-measurement SPSA. The algorithm, under certain conditions, outperforms both forms of SPSA with the only overhead being the storage of a single measurement. We also propose a similar algorithm that uses perturbations obtained from normalized Hadamard matrices. The convergence w.p. 1 of both algorithms is established. We extend measurement reuse to design two second-order SPSA algorithms and sketch the convergence analysis. Finally, we present simulation results on an illustrative minimization problem.
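As background for the comparison drawn in this abstract, here is a minimal sketch of the standard two-measurement SPSA gradient estimate, not the one-measurement or Hadamard-based variants the paper proposes. All function names, the gain sequences and the test objective are illustrative assumptions.

```python
import random


def spsa_gradient(loss, theta, c, rng):
    """Two-measurement SPSA estimate of the gradient of loss at theta.

    All coordinates are perturbed at once by independent +/-1 signs,
    so only two loss evaluations are needed regardless of dimension.
    """
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = loss(plus) - loss(minus)
    return [diff / (2.0 * c * d) for d in delta]


def spsa_minimize(loss, theta0, iterations=2000, a=0.1, c=0.1, seed=0):
    """Plain SPSA descent with decaying gains (illustrative choices)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # step size
        c_k = c / k ** 0.101  # perturbation size
        grad = spsa_gradient(loss, theta, c_k, rng)
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta


def shifted_quadratic(x):
    """Toy objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    print(spsa_minimize(shifted_quadratic, [0.0, 0.0, 0.0]))
```

The one-measurement form replaces the difference of two evaluations with a single evaluation, which is where the additional bias term mentioned above comes from.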
Abstract:
For a class of distributed recursive algorithms, it is shown that a stochastic approximation-like tapering stepsize routine suppresses the effects of interprocessor delays.
Abstract:
Smoothed functional (SF) schemes for gradient estimation are known to be efficient in stochastic optimization algorithms, especially when the objective is to improve the performance of a stochastic system. However, the performance of these methods depends on several parameters, such as the choice of a suitable smoothing kernel. Different kernels have been studied in the literature, which include Gaussian, Cauchy, and uniform distributions, among others. This article studies a new class of kernels based on the q-Gaussian distribution, which has gained popularity in statistical physics over the last decade. Though the importance of this family of distributions is attributed to its ability to generalize the Gaussian distribution, we observe that this class encompasses almost all existing smoothing kernels. This motivates us to study SF schemes for gradient estimation using the q-Gaussian distribution. Using the derived gradient estimates, we propose two-timescale algorithms for optimization of a stochastic objective function in a constrained setting with a projected gradient search approach. We prove the convergence of our algorithms to the set of stationary points of an associated ODE. We also demonstrate their performance numerically through simulations on a queuing model.
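To make the idea of a smoothing kernel concrete, the sketch below shows a two-sided smoothed functional gradient estimate with the ordinary Gaussian kernel, not the q-Gaussian kernels studied in the article. The function names, the smoothing parameter `beta`, and the sample count are assumptions chosen for illustration.

```python
import random


def sf_gradient(f, theta, beta=0.1, samples=20000, seed=0):
    """Two-sided Gaussian smoothed functional gradient estimate.

    Averages eta * (f(theta + beta*eta) - f(theta - beta*eta)) / (2*beta)
    over standard normal perturbation vectors eta.
    """
    rng = random.Random(seed)
    n = len(theta)
    grad = [0.0] * n
    for _ in range(samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
        plus = [t + beta * e for t, e in zip(theta, eta)]
        minus = [t - beta * e for t, e in zip(theta, eta)]
        scale = (f(plus) - f(minus)) / (2.0 * beta)
        for i in range(n):
            grad[i] += eta[i] * scale
    return [g / samples for g in grad]


def sum_of_squares(x):
    """Toy objective whose exact gradient at x is 2*x."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    # The exact gradient at (1, -2) is (2, -4); the estimate is noisy.
    print(sf_gradient(sum_of_squares, [1.0, -2.0]))
```

Swapping the distribution of `eta` (and the corresponding weighting) is what changing the smoothing kernel amounts to in this kind of scheme.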
Abstract:
The maximum independent set problem is NP-complete even when restricted to planar graphs, cubic planar graphs or triangle-free graphs. The problem of finding an absolute approximation still remains NP-complete. Various polynomial time approximation algorithms for planar graphs have been proposed that guarantee a fixed worst-case ratio between the size of the independent set obtained and the size of the maximum independent set. We present in this paper a simple and efficient O(|V|) algorithm that guarantees a ratio of 1/2 for planar triangle-free graphs. The algorithm differs completely from other approaches in that it collects groups of independent vertices at a time. Certain bounds we obtain in this paper relate to some interesting questions in the theory of extremal graphs.
Abstract:
Bluetooth is a short-range radio technology operating in the unlicensed industrial-scientific-medical (ISM) band at 2.45 GHz. A piconet is basically a collection of slaves controlled by a master. A scatternet, on the other hand, is established by linking several piconets together in an ad hoc fashion to yield a global wireless ad hoc network. This paper proposes a scheduling policy that aims to achieve increased system throughput and reduced packet delays while providing reasonably good fairness among all traffic flows in Bluetooth piconets and scatternets. We propose a novel algorithm for scheduling slots to slaves for both piconets and scatternets using multi-layered parameterized policies. Our scheduling scheme works with real data and obtains an optimal feedback policy within prescribed parameterized classes of these by using an efficient two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithm. We show the convergence of our algorithm to an optimal multi-layered policy. We also propose novel polling schemes for intra- and inter-piconet scheduling that are seen to perform well. We present an extensive set of simulation results and performance comparisons with existing scheduling algorithms. Our results indicate that our proposed scheduling algorithm performs better overall on a wide range of experiments than the existing algorithms for both piconets (Das et al. in INFOCOM, pp. 591–600, 2001; Lapeyrie and Turletti in INFOCOM conference proceedings, San Francisco, US, 2003; Shreedhar and Varghese in SIGCOMM, pp. 231–242, 1995) and scatternets (Har-Shai et al. in OPNETWORK, 2002; Saha and Matsumot in AICT/ICIW, 2006; Tan and Guttag in The 27th annual IEEE conference on local computer networks (LCN), Tampa, 2002). Our studies also confirm that our proposed scheme achieves a high throughput and low packet delays with reasonable fairness among all the connections.
Abstract:
In this paper, we study the behaviour of the slotted Aloha multiple access scheme with a finite number of users under different traffic loads and optimize the retransmission probability q_r for various settings, cost objectives and policies. First, we formulate the problem as a parameter optimization problem and use certain efficient smoothed functional algorithms for finding the optimal retransmission probability parameter. Next, we propose two classes of multi-level closed-loop feedback policies (for finding in each case the retransmission probability q_r that now depends on the current system state) and apply the above algorithms for finding an optimal policy within each class of policies. While one of the policy classes depends on the number of backlogged nodes in the system, the other depends on the number of time slots since the last successful transmission. The latter policies are more realistic as it is difficult to keep track of the number of backlogged nodes at each instant. We investigate the effect of increasing the number of levels in the feedback policies. We also investigate the effects of using different cost functions (with and without penalization) in our algorithms and the corresponding change in the throughput and delay using these. Both of our algorithms use two-timescale stochastic approximation. One of the algorithms uses one simulation while the other uses two simulations of the system. The two-simulation algorithm is seen to perform better than the other algorithm. Optimal multi-level closed-loop policies are seen to perform better than optimal open-loop policies. The performance further improves when more levels are used in the feedback policies.
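The sketch below illustrates why the retransmission probability matters and why a state-dependent choice can help. It is a deliberately simplified model and not the paper's setting: it assumes exactly n backlogged nodes, no new arrivals, and that each node retransmits independently with probability q in a slot. The function names are hypothetical.

```python
def success_probability(n, q):
    """Probability that exactly one of n backlogged nodes transmits.

    A slot is successful only when a single node transmits, so the
    probability is n * q * (1 - q) ** (n - 1).
    """
    return n * q * (1.0 - q) ** (n - 1)


def best_retransmission_probability(n, grid=1000):
    """Grid search for the q in (0, 1) maximizing the success probability."""
    candidates = [i / grid for i in range(1, grid)]
    return max(candidates, key=lambda q: success_probability(n, q))


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        q = best_retransmission_probability(n)
        print(n, q, success_probability(n, q))
```

In this toy model the best q shrinks as the backlog grows (it equals 1/n), which is the intuition behind feedback policies that adapt the retransmission probability to the observed system state rather than fixing it in advance.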
Abstract:
The overall performance of random early detection (RED) routers in the Internet is determined by the settings of their associated parameters. The non-availability of a functional relationship between the RED performance and its parameters makes it difficult to implement optimization techniques directly in order to optimize the RED parameters. In this paper, we formulate a generic optimization framework using a stochastically bounded delay metric to dynamically adapt the RED parameters. The constrained optimization problem thus formulated is solved using traditional nonlinear programming techniques. Here, we implement the barrier and penalty function approaches, respectively. We adopt a second-order nonlinear optimization framework and propose a novel four-timescale stochastic approximation algorithm to estimate the gradient and Hessian of the barrier and penalty objectives and update the RED parameters. A convergence analysis of the proposed algorithm is briefly sketched. We perform simulations to evaluate the performance of our algorithm with both barrier and penalty objectives and compare these with RED and a variant of it in the literature. We observe an improvement in performance using our proposed algorithm over RED, and the above variant of it.
Abstract:
The author presents adaptive control techniques for controlling the flow of real-time jobs from the peripheral processors (PPs) to the central processor (CP) of a distributed system with a star topology. He considers two classes of flow control mechanisms: (1) proportional control, where a certain proportion of the load offered to each PP is sent to the CP, and (2) threshold control, where there is a maximum rate at which each PP can send jobs to the CP. The problem is to obtain good algorithms for dynamically adjusting the control level at each PP in order to prevent overload of the CP, when the load offered by the PPs is unknown and varying. The author formulates the problem approximately as a standard system control problem in which the system has unknown parameters that are subject to change. Using well-known techniques (e.g., naive-feedback-controller and stochastic approximation techniques), he derives adaptive controls for the system control problem. He demonstrates the efficacy of these controls in the original problem by using the control algorithms in simulations of a queuing model of the CP and the load controls.
Abstract:
The k-colouring problem is to colour a given k-colourable graph with k colours. This problem is known to be NP-hard even for fixed k ≥ 3. The best known polynomial time approximation algorithms require n^δ (for a positive constant δ depending on k) colours to colour an arbitrary k-colourable n-vertex graph. The situation is entirely different if we look at the average performance of an algorithm rather than its worst-case performance. It is well known that a k-colourable graph drawn from certain classes of distributions can be k-coloured almost surely in polynomial time. In this paper, we present further results in this direction. We consider k-colourable graphs drawn from the random model in which each allowed edge is chosen independently with probability p(n) after initially partitioning the vertex set into k colour classes. We present polynomial time algorithms of two different types. The first type of algorithm always runs in polynomial time and succeeds almost surely. Algorithms of this type have been proposed before, but our algorithms have provably exponentially small failure probabilities. The second type of algorithm always succeeds and has polynomial running time on average. Such algorithms are more useful and more difficult to obtain than the first type of algorithms. Our algorithms work as long as p(n) ≥ n^(-1+ε), where ε is a constant greater than 1/4.
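The random model described in this abstract can be sketched as follows. This only generates instances; it does not implement the paper's colouring algorithms. The function names are hypothetical, and assigning each vertex to a colour class uniformly at random is an assumption, since the abstract does not say how the initial partition is chosen.

```python
import random


def planted_k_colourable_graph(n, k, p, seed=0):
    """Sample a graph with a planted proper k-colouring.

    Vertices are first split into k colour classes (uniformly at random
    here, which is an assumption). Each "allowed" edge, meaning a pair
    of vertices in different classes, is then included independently
    with probability p.
    """
    rng = random.Random(seed)
    colour = [rng.randrange(k) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if colour[u] != colour[v] and rng.random() < p:
                edges.append((u, v))
    return colour, edges


def is_proper(colour, edges):
    """Check that no edge joins two vertices of the same colour."""
    return all(colour[u] != colour[v] for u, v in edges)


if __name__ == "__main__":
    colour, edges = planted_k_colourable_graph(n=60, k=3, p=0.3)
    print(len(edges), is_proper(colour, edges))
```

A colouring algorithm is given only the edge list; the planted colouring is hidden and serves as a witness that the graph is k-colourable.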