988 resultados para 10-1
Resumo:
We propose a randomized algorithm for large scale SVM learning which solves the problem by iterating over random subsets of the data. Crucial to the algorithm for scalability is the size of the subsets chosen. In the context of text classification we show that, by using ideas from random projections, a sample size of O(log n) can be used to obtain a solution which is close to the optimal with a high probability. Experiments done on synthetic and real life data sets demonstrate that the algorithm scales up SVM learners, without loss in accuracy. 1
Resumo:
We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values of states with actions chosen according to the ‘current ’ randomized policy updates. A sketch of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms for routing on different network topologies are presented and performance comparisons with the regular Q-learning algorithm are shown.
Resumo:
In this paper, we address a key problem faced by advertisers in sponsored search auctions on the web: how much to bid, given the bids of the other advertisers, so as to maximize individual payoffs? Assuming the generalized second price auction as the auction mechanism, we formulate this problem in the framework of an infinite horizon alternative-move game of advertiser bidding behavior. For a sponsored search auction involving two advertisers, we characterize all the pure strategy and mixed strategy Nash equilibria. We also prove that the bid prices will lead to a Nash equilibrium, if the advertisers follow a myopic best response bidding strategy. Following this, we investigate the bidding behavior of the advertisers if they use Q-learning. We discover empirically an interesting trend that the Q-values converge even if both the advertisers learn simultaneously.
Resumo:
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic rein- forcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their com- patibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further re- duce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal differ- ence learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
INTACTE: An Interconnect Area, Delay, and Energy Estimation Tool for Microarchitectural Explorations
Resumo:
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters. Hence these models are hard to use directly to make high level microarchitectural trade-offs in the initial exploration phase of a design. In this paper, we propose INTACTE, a tool that can be used by architects toget reasonably accurate interconnect area, delay, and power estimates based on a few architecture level parameters for the interconnect such as length, width (in number of bits), frequency, and latency for a specified technology and voltage. The tool uses well known models of interconnect delay and energy taking into account the wire pitch, repeater size, and spacing for a range of voltages and technologies.It then solves an optimization problem of finding the lowest energy interconnect design in terms of the low level circuit parameters, which meets the architectural constraintsgiven as inputs. In addition, the tool also provides the area, energy, and delay for a range of supply voltages and degrees of pipelining, which can be used for micro-architectural exploration of a chip. The delay and energy models used by the tool have been validated against low level circuit simulations. We discuss several potential applications of the tool and present an example of optimizing interconnect design in the context of clustered VLIW architectures. Copyright 2007 ACM.
Resumo:
Abstract. Let G = (V,E) be a weighted undirected graph, with non-negative edge weights. We consider the problem of efficiently computing approximate distances between all pairs of vertices in G. While many efficient algorithms are known for this problem in unweighted graphs, not many results are known for this problem in weighted graphs. Zwick [14] showed that for any fixed ε> 0, stretch 1 1 + ε distances between all pairs of vertices in a weighted directed graph on n vertices can be computed in Õ(n ω) time, where ω < 2.376 is the exponent of matrix multiplication and n is the number of vertices. It is known that finding distances of stretch less than 2 between all pairs of vertices in G is at least as hard as Boolean matrix multiplication of two n×n matrices. It is also known that all-pairs stretch 3 distances can be computed in Õ(n 2) time and all-pairs stretch 7/3 distances can be computed in Õ(n 7/3) time. Here we consider efficient algorithms for the problem of computing all-pairs stretch (2+ε) distances in G, for any 0 < ε < 1. We show that all pairs stretch (2 + ε) distances for any fixed ε> 0 in G can be computed in expected time O(n 9/4 logn). This algorithm uses a fast rectangular matrix multiplication subroutine. We also present a combinatorial algorithm (that is, it does not use fast matrix multiplication) with expected running time O(n 9/4) for computing all-pairs stretch 5/2 distances in G. 1
Resumo:
Abstract—A new breed of processors like the Cell Broadband Engine, the Imagine stream processor and the various GPU processors emphasize data-level parallelism (DLP) and threadlevel parallelism (TLP) as opposed to traditional instructionlevel parallelism (ILP). This allows them to achieve order-ofmagnitude improvements over conventional superscalar processors for many workloads. However, it is unclear as to how much parallelism of these types exists in current programs. Most earlier studies have largely concentrated on the amount of ILP in a program, without differentiating DLP or TLP. In this study, we investigate the extent of data-level parallelism available in programs in the MediaBench suite. By packing instructions in a SIMD fashion, we observe reductions of up to 91 % (84 % on average) in the number of dynamic instructions, indicating a very high degree of DLP in several applications. I.
Resumo:
This paper addresses the problem of multiagent search in an unknown environment. The agents are autonomous in nature and are equipped with necessary sensors to carry out the search operation. The uncertainty, or lack of information about the search area is known a priori as a probability density function. The agents are deployed in an optimal way so as to maximize the one step uncertainty reduction. The agents continue to deploy themselves and reduce uncertainty till the uncertainty density is reduced over the search space below a minimum acceptable level. It has been shown, using LaSalle’s invariance principle, that a distributed control law which moves each of the agents towards the centroid of its Voronoi partition, modified by the sensor range leads to single step optimal deployment. This principle is now used to devise search trajectories for the agents. The simulations were carried out in 2D space with saturation on speeds of the agents. The results show that the control strategy per step indeed moves the agents to the respective centroid and the algorithm reduces the uncertainty distribution to the required level within a few steps.
Resumo:
Abstract—This document introduces a new kinematic simulation of a wheeled mobile robot operating on uneven terrain. Our modeling method borrows concepts from dextrous manipulation. This allows for an accurate simulation of the way 3-dimensional wheels roll over a smooth ground surface. The purpose of the simulation is to validate a new concept for design of off-road wheel suspensions, called Passive Variable Camber (PVC). We show that PVC eliminates kinematic slip for an outdoor robot. Both forward and inverse kinematics are discussed and simulation results are presented.
Resumo:
We study the problem of optimal bandwidth allocation in communication networks. We consider a queueing model with two queues to which traffic from different competing flows arrive. The queue length at the buffers is observed every T instants of time, on the basis of which a decision on the amount of bandwidth to be allocated to each buffer for the next T instants is made. We consider a class of closed-loop feedback policies for the system and use a twotimescale simultaneous perturbation stochastic approximation(SPSA) algorithm to find an optimal policy within the prescribed class. We study the performance of the proposed algorithm on a numerical setting. Our algorithm is found to exhibit good performance.
Resumo:
We develop a simulation based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.
Resumo:
The problem of finding optimal parameterized feedback policies for dynamic bandwidth allocation in communication networks is studied. We consider a queueing model with two queues to which traffic from different competing flows arrive. The queue length at the buffers is observed every T instants of time, on the basis of which a decision on the amount of bandwidth to be allocated to each buffer for the next T instants is made. We consider two different classes of multilevel closed-loop feedback policies for the system and use a two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithm to find optimal policies within each prescribed class. We study the performance of the proposed algorithm on a numerical setting and show performance comparisons of the two optimal multilevel closedloop policies with optimal open loop policies. We observe that closed loop policies of Class B that tune parameters for both the queues and do not have the constraint that the entire bandwidth be used at each instant exhibit the best results overall as they offer greater flexibility in parameter tuning. Index Terms — Resource allocation, dynamic bandwidth allocation in communication networks, two-timescale SPSA algorithm, optimal parameterized policies. I.
Resumo:
The variation of equilibrium oxygen potential with oxygen concentration inYBa 2Cu3O7-δhas been measured in the temperature range of 773 to 1223 K. For temperatures up to 1073 K, the oxygen content of theYBa 2Cu3O7-δsample, held in a stabilized-zirconia crucible, was altered by coulometric titration. The compound was in contact with the electrolyte, permitting direct exchange of oxygen ions. For measurements above 1073 K, the oxide was contained in a magnesia crucible placed inside a closed silica tube. The oxygen potential in the gas phase above the 123 compound was controlled and measured by a solid-state cell based on yttria-stabilized zirconia, which served both as a pump and sensor. Pure oxygen at a pressure of 1.01 × 105 Pa was used as the reference electrode. The oxygen pressure over the sample was varied from 10-1 to 105 Pa. The oxygen concentrations of the sample equilibrated with pure oxygen at 1.01 × 105 Pa at different temperatures were determined after quenching in liquid nitrogen by hydrogen reduction at 1223 K. The plot of chemical potential of oxygen as a function of oxygen non-stoichiometry shows an inflexion at δ ∼ 0.375 at 873 K. Data at 773 K indicate tendency for phase separation at lower temperatures. The partial enthalpy and entropy of oxygen derived from the temperature dependence of electromotive force (emf ) exhibit variation with composition. The partial enthalpy for °= 0.3, 0.4, and 0.5 also appears to be temperature dependent. The results are discussed in comparison with the data reported in the literature. An expression for the integral free energy of formation of YBa2Cu3O6.5 is evaluated based on measurements reported in the literature. By integration of the partial Gibbs’ energy of oxygen obtained in this study, the variation of integral property with oxygen concentration is obtained at 873 K.
Resumo:
In order to identify the dominant mechanism of ionic conduction, the electrical conductivity and ionic mobility of the glasses (AgX)0.4(Ag2O)0.3(GeO2)0.3 (X = I, Br, Cl) were measured separately in the temperature range from 293 to 393 K by coupling the AC technique with the TIC method. Electronic conductivity was also measured at 293 K by the Wagner polarization method. The total electrical conductivity of these glasses was found to be as high as 10-1 Ω-1 m-1, and the mobility about 10-6 m2 V-1 s-1. The variation of total electrical conductivity and mobility at constant temperature and composition with the type of halide occurred in the sequence, Cl < Br < I. For each composition, both conductivity and mobility increased with temperature. The mobile ion concentration was found to be about 1023 m-3 at 293 K, and it was insensitive to the type of halide as well as temperature. The results suggest that the change in ionic conductivity with the temperature and the type of halide present is mainly attributable to the change in ionic mobility rather than carrier concentration. Moreover, the electronic conductivity was found to be about 10-6 Ω-1 m-1 at 293 K. Thus, the electronic contribution to the total conductivity is negligibly small.
Resumo:
We consider the problem of computing a minimum cycle basis in a directed graph G. The input to this problem is a directed graph whose arcs have positive weights. In this problem a {- 1, 0, 1} incidence vector is associated with each cycle and the vector space over Q generated by these vectors is the cycle space of G. A set of cycles is called a cycle basis of G if it forms a basis for its cycle space. A cycle basis where the sum of weights of the cycles is minimum is called a minimum cycle basis of G. The current fastest algorithm for computing a minimum cycle basis in a directed graph with m arcs and n vertices runs in O(m(w+1)n) time (where w < 2.376 is the exponent of matrix multiplication). If one allows randomization, then an (O) over tilde (m(3)n) algorithm is known for this problem. In this paper we present a simple (O) over tilde (m(2)n) randomized algorithm for this problem. The problem of computing a minimum cycle basis in an undirected graph has been well-studied. In this problem a {0, 1} incidence vector is associated with each cycle and the vector space over F-2 generated by these vectors is the cycle space of the graph. The fastest known algorithm for computing a minimum cycle basis in an undirected graph runs in O(m(2)n + mn(2) logn) time and our randomized algorithm for directed graphs almost matches this running time.