171 results for Stochastic convergence


Relevance:

20.00%

Publisher:

Abstract:

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional (the minimizer over the player's actions of expected loss) defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary.

Peter L. Bartlett, Alexander Rakhlin
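The "sum of minimal expected losses minus minimal empirical loss" quantity can be made concrete with a small numerical sketch. The setup below is an assumption chosen for illustration, not taken from the paper: binary outcomes drawn i.i.d. from Bernoulli(p), a player action a in [0, 1], and absolute loss |a - x|. Because it fixes one i.i.d. adversary rather than maximizing over joint distributions, the value it returns is only a lower bound on the optimal regret. The function name `jensen_gap` is hypothetical.

```python
from math import comb


def jensen_gap(T, p):
    """Gap between T * (minimal expected loss) and E[minimal empirical loss].

    Assumed toy game: outcomes x_t ~ Bernoulli(p) i.i.d., actions a in [0, 1],
    loss |a - x|. The expected loss is linear in a, so its minimum is attained
    at a = 0 or a = 1 and equals min(p, 1 - p).
    """
    minimal_expected_loss = min(p, 1 - p)

    # With k ones among T outcomes, the best fixed action in hindsight
    # suffers min(k, T - k). Average this over k ~ Binomial(T, p).
    expected_min_empirical_loss = sum(
        comb(T, k) * p**k * (1 - p) ** (T - k) * min(k, T - k)
        for k in range(T + 1)
    )
    return T * minimal_expected_loss - expected_min_empirical_loss


if __name__ == "__main__":
    for horizon in (10, 100, 1000):
        print(horizon, jensen_gap(horizon, 0.5))
```

The gap is non-negative because the minimal expected loss is a concave function of the outcome distribution, which is the Jensen's-inequality reading described in the abstract.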

Relevance:

20.00%

Publisher:

Abstract:

Maximum-likelihood estimates of the parameters of stochastic differential equations are consistent and asymptotically efficient, but unfortunately difficult to obtain if a closed-form expression for the transitional probability density function of the process is not available. As a result, a large number of competing estimation procedures have been proposed. This article provides a critical evaluation of the various estimation techniques. Special attention is given to the ease of implementation and comparative performance of the procedures when estimating the parameters of the Cox–Ingersoll–Ross and Ornstein–Uhlenbeck equations respectively.
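For the Ornstein-Uhlenbeck case the transition density is Gaussian and known in closed form, so exact maximum likelihood is straightforward; this is the easy end of the spectrum the abstract describes. The sketch below assumes the parameterization dX = kappa (theta - X) dt + sigma dW with equally spaced observations, and uses the fact that the exact discretization is an AR(1) model, so the conditional MLE reduces to a least-squares fit. The function names and parameter values are illustrative assumptions, not the procedures evaluated in the article.

```python
import math
import random


def simulate_ou(kappa, theta, sigma, x0, dt, n, rng):
    """Sample an OU path exactly from its Gaussian transition density."""
    b = math.exp(-kappa * dt)
    step_sd = sigma * math.sqrt((1 - b * b) / (2 * kappa))
    path = [x0]
    for _ in range(n):
        path.append(theta + (path[-1] - theta) * b + step_sd * rng.gauss(0, 1))
    return path


def fit_ou(path, dt):
    """Conditional MLE of (kappa, theta, sigma) via the AR(1) representation
    X[i+1] = a + b * X[i] + noise, with b = exp(-kappa * dt)."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    if not 0 < b < 1:
        raise ValueError("estimated AR coefficient is outside (0, 1)")
    noise_var = sum((v - a - b * u) ** 2 for u, v in zip(x, y)) / n
    kappa = -math.log(b) / dt
    theta = a / (1 - b)
    sigma = math.sqrt(2 * kappa * noise_var / (1 - b * b))
    return kappa, theta, sigma


if __name__ == "__main__":
    rng = random.Random(0)
    path = simulate_ou(2.0, 1.0, 0.5, x0=1.0, dt=0.1, n=20000, rng=rng)
    print(fit_ou(path, dt=0.1))
```

The Cox-Ingersoll-Ross transition density is not Gaussian, so this regression shortcut does not carry over to it.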

Relevance:

20.00%

Publisher:

Abstract:

An initialisation process is a key component in modern stream cipher design. A well-designed initialisation process should ensure that each key-IV pair generates a different key stream. In this paper, we analyse two ciphers, A5/1 and Mixer, for which this does not happen due to state convergence. We show how the state convergence problem occurs and estimate the effective key-space in each case.
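State convergence can be illustrated with a toy example: if the state-update function used during initialisation is not one-to-one, distinct starting states merge and the set of reachable states shrinks. The 8-bit update below is a hypothetical construction for illustration only; it is not the update function of A5/1 or Mixer.

```python
import math

STATE_BITS = 8
MASK = (1 << STATE_BITS) - 1


def toy_update(state):
    """Hypothetical non-injective update: shift left and feed back the AND
    of bits 7 and 3. Bit 7 is discarded whenever bit 3 is 0, so two states
    that differ only in bit 7 then collapse to the same successor."""
    feedback = (state >> 7) & (state >> 3) & 1
    return ((state << 1) & MASK) | feedback


def reachable_counts(rounds):
    """Number of distinct states remaining after each initialisation round,
    starting from every possible 8-bit state."""
    states = set(range(1 << STATE_BITS))
    counts = [len(states)]
    for _ in range(rounds):
        states = {toy_update(s) for s in states}
        counts.append(len(states))
    return counts


if __name__ == "__main__":
    for round_index, count in enumerate(reachable_counts(8)):
        print(round_index, count, round(math.log2(count), 2))
```

The base-2 logarithm of the surviving state count gives a rough analogue of the effective key-space estimate mentioned in the abstract.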

Relevance:

20.00%

Publisher:

Abstract:

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple-agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
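The storage claim (twice the number of policy parameters) corresponds to two running vectors: an eligibility trace z, discounted by β, and the gradient estimate Δ, kept as a running average of reward times trace. The sketch below follows that update pattern on an assumed toy problem, a stateless two-action task with a single logistic policy parameter, where action 1 yields reward 1 and action 0 yields reward 0. The environment, the function name, and the parameter values are illustrative assumptions; this is not the paper's experimental setup.

```python
import math
import random


def gpomdp_gradient(theta, beta, steps, rng):
    """Estimate d(average reward)/d(theta) with a GPOMDP-style update.

    Assumed toy policy: P(action = 1) = sigmoid(theta).
    Assumed toy reward: equal to the action taken (0 or 1).
    """
    z = 0.0       # eligibility trace, one entry per policy parameter
    delta = 0.0   # running gradient estimate, one entry per parameter
    for t in range(steps):
        p1 = 1.0 / (1.0 + math.exp(-theta))
        action = 1 if rng.random() < p1 else 0
        # Score function: derivative of log P(action) with respect to theta.
        score = (1.0 - p1) if action == 1 else -p1
        reward = float(action)
        z = beta * z + score
        delta += (reward * z - delta) / (t + 1)
    return delta


if __name__ == "__main__":
    # The exact gradient here is sigmoid'(0) = 0.25.
    print(gpomdp_gradient(0.0, beta=0.0, steps=50000, rng=random.Random(1)))
```

In this toy task rewards do not depend on earlier actions, so any β in [0, 1) gives an unbiased estimate and larger β only adds variance; the bias-variance trade-off in the abstract arises when rewards depend on the history.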

Relevance:

20.00%

Publisher:

Abstract:

Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both the log-linear and max-margin cases corresponds to minimizing a convex function with simplex constraints. We study both batch and online variants of the algorithm, and provide rates of convergence for both cases. In the max-margin case, O(1/ε) EG updates are required to reach a given accuracy ε in the dual; in contrast, for log-linear models only O(log(1/ε)) updates are required. For both the max-margin and log-linear cases, our bounds suggest that the online EG algorithm requires a factor of n less computation to reach a desired accuracy than the batch EG algorithm, where n is the number of training examples. Our experiments confirm that the online algorithms are much faster than the batch algorithms in practice. We describe how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to problems such as sequence learning or natural language parsing. We perform extensive evaluation of the algorithms, comparing them to L-BFGS and stochastic gradient descent for log-linear models, and to SVM-Struct for max-margin models. The algorithms are applied to a multi-class problem as well as to a more complex large-scale parsing task. In all these settings, the EG algorithms presented here outperform the other methods.
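The basic exponentiated gradient step for a convex function over the probability simplex multiplies each coordinate by the exponential of its negative scaled gradient and then renormalizes. The sketch below applies that generic step to an assumed toy objective, 0.5 * ||w - c||^2 with c inside the simplex; it does not reproduce the log-linear or max-margin dual objectives from the paper, and the function names and learning rate are illustrative.

```python
import math


def eg_step(w, grad, eta):
    """One exponentiated-gradient update; the result stays on the simplex."""
    unnormalized = [wi * math.exp(-eta * gi) for wi, gi in zip(w, grad)]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


def minimize_on_simplex(grad_fn, dim, eta, iters):
    """Run batch EG from the uniform distribution."""
    w = [1.0 / dim] * dim
    for _ in range(iters):
        w = eg_step(w, grad_fn(w), eta)
    return w


if __name__ == "__main__":
    target = [0.2, 0.3, 0.5]  # assumed toy minimizer inside the simplex

    def grad(w):
        # Gradient of 0.5 * ||w - target||^2.
        return [wi - ti for wi, ti in zip(w, target)]

    print(minimize_on_simplex(grad, dim=3, eta=0.5, iters=500))
```

Because the update is multiplicative, positivity and the sum-to-one constraint are preserved without an explicit projection step.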

Relevance:

20.00%

Publisher:

Abstract:

Bistability arises within a wide range of biological systems from the λ phage switch in bacteria to cellular signal transduction pathways in mammalian cells. Changes in regulatory mechanisms may result in genetic switching in a bistable system. Recently, more and more experimental evidence in the form of bimodal population distributions indicates that noise plays a very important role in the switching of bistable systems. Although deterministic models have been used for studying the existence of bistability properties under various system conditions, these models cannot realize cell-to-cell fluctuations in genetic switching. However, there is a lag in the development of stochastic models for studying the impact of noise in bistable systems because of the lack of detailed knowledge of biochemical reactions, kinetic rates, and molecular numbers. In this work, we develop a previously undescribed general technique for developing quantitative stochastic models for large-scale genetic regulatory networks by introducing Poisson random variables into deterministic models described by ordinary differential equations. Two stochastic models have been proposed for the genetic toggle switch interfaced with either the SOS signaling pathway or a quorum-sensing signaling pathway, and we have successfully realized experimental results showing bimodal population distributions. Because the introduced stochastic models are based on widely used ordinary differential equation models, the success of this work suggests that this approach is a very promising one for studying noise in large-scale genetic regulatory networks.
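One way to read "introducing Poisson random variables into deterministic models" is to replace each ODE rate term, over a short time step, with a Poisson-distributed number of reaction events whose mean is the rate multiplied by the step length. The sketch below applies this idea to an assumed dimensionless toggle-switch model, du/dt = alpha / (1 + v^n) - u and dv/dt = alpha / (1 + u^n) - v, treating u and v as molecule counts. The model, the parameter values, and the function names are illustrative assumptions and are not the SOS or quorum-sensing models from the paper.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def toggle_step(u, v, tau, rng, alpha=10.0, hill=2.0):
    """Advance both repressor counts by one step of length tau."""
    made_u = poisson(tau * alpha / (1.0 + v**hill), rng)
    made_v = poisson(tau * alpha / (1.0 + u**hill), rng)
    # Cap degradation so that counts never become negative.
    lost_u = min(u, poisson(tau * u, rng))
    lost_v = min(v, poisson(tau * v, rng))
    return u + made_u - lost_u, v + made_v - lost_v


def simulate(steps, tau=0.05, seed=0, u0=0, v0=0):
    """Return the list of (u, v) states, including the initial one."""
    rng = random.Random(seed)
    state = (u0, v0)
    trajectory = [state]
    for _ in range(steps):
        state = toggle_step(state[0], state[1], tau, rng)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(simulate(400, seed=1)[-1])
```

Running many seeds and collecting the final states into a histogram is one way to look for the bimodal population distributions the abstract refers to.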