964 results for Computational Simulation


Relevance:

20.00%

Publisher:

Abstract:

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation, and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
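The storage claim in the abstract (twice the number of policy parameters) can be illustrated with a minimal sketch: one eligibility trace and one running gradient estimate per parameter, with β discounting the trace. The toy problem below is an assumption made for illustration and is not taken from the paper: a single-parameter sigmoid policy chooses between two actions, the next state equals the chosen action, and the reward equals that state. The function name `gpomdp_gradient` and its arguments are hypothetical.

```python
import math
import random


def gpomdp_gradient(theta, beta, steps, seed=0):
    """GPOMDP-style gradient estimate on a toy one-parameter problem.

    Assumed toy setup (not from the paper): P(action = 1) = sigmoid(theta),
    the next state equals the action, and the reward equals that state, so
    the average reward is sigmoid(theta).
    """
    rng = random.Random(seed)
    z = 0.0      # eligibility trace, one entry per policy parameter
    delta = 0.0  # running gradient estimate, one entry per policy parameter
    for t in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))
        action = 1 if rng.random() < p else 0
        score = action - p          # d/dtheta of log P(action | theta)
        reward = float(action)      # reward observed in the next state
        z = beta * z + score        # beta-discounted trace of score functions
        delta += (reward * z - delta) / (t + 1)  # running average of reward * z
    return delta


if __name__ == "__main__":
    # The true gradient at theta = 0 is sigmoid'(0) = 0.25.
    print(gpomdp_gradient(theta=0.0, beta=0.5, steps=20000))
```

In this toy problem the state forgets its past after one step, so any β in [0,1) gives an unbiased estimate and a larger β only adds variance. In a slowly mixing POMDP the trade-off runs the other way, which is the relationship to mixing time that the abstract refers to.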

Relevance:

20.00%

Publisher:

Abstract:

We study Krylov subspace methods for approximating the matrix-function vector product φ(tA)b where φ(z) = [exp(z) - 1]/z. This product arises in the numerical integration of large stiff systems of differential equations by the Exponential Euler Method, where A is the Jacobian matrix of the system. Recently, this method has found application in the simulation of transport phenomena in porous media within mathematical models of wood drying and groundwater flow. We develop an a posteriori upper bound on the Krylov subspace approximation error and provide a new interpretation of a previously published error estimate. This leads to an alternative Krylov approximation to φ(tA)b, the so-called Harmonic Ritz approximant, which we find does not exhibit oscillatory behaviour of the residual error.
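As a small illustration of the function φ(z) = [exp(z) - 1]/z and of the Exponential Euler step y_{n+1} = y_n + h φ(hA) f(y_n), the sketch below treats the scalar case only, where A is a single number. It does not implement the Krylov subspace or Harmonic Ritz approximations studied in the paper; the function names `phi` and `exponential_euler_step` and the test problem y' = a·y + c are assumptions chosen for illustration.

```python
import math


def phi(z):
    """phi(z) = (exp(z) - 1) / z, with the limit phi(0) = 1."""
    if z == 0.0:
        return 1.0
    # expm1 avoids the cancellation in exp(z) - 1 for small |z|.
    return math.expm1(z) / z


def exponential_euler_step(y, h, a, c):
    """One Exponential Euler step for the scalar ODE y' = a*y + c.

    The Jacobian of the right-hand side is the scalar a, so the step is
    y + h * phi(h*a) * f(y) with f(y) = a*y + c.
    """
    return y + h * phi(h * a) * (a * y + c)


if __name__ == "__main__":
    # Stiff example: explicit Euler with h = 0.1 would be unstable for a = -1000.
    a, c, h, y0 = -1000.0, 500.0, 0.1, 1.0
    y1 = exponential_euler_step(y0, h, a, c)
    exact = math.exp(a * h) * y0 + (math.exp(a * h) - 1.0) / a * c
    print(y1, exact)
```

For a linear right-hand side the step reproduces the exact solution, which is why the method tolerates stiffness. For large systems, φ(hA) applied to a vector cannot be formed directly, and that is where the Krylov approximations discussed in the abstract come in.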

Relevance:

20.00%

Publisher:

Abstract:

The delay stochastic simulation algorithm (DSSA) by Barrio et al. [PLoS Comput. Biol. 2, 117–E (2006)] was developed to simulate delayed processes in cell biology in the presence of intrinsic noise, that is, when there are small-to-moderate numbers of certain key molecules present in a chemical reaction system. These delayed processes can faithfully represent complex interactions and mechanisms that imply a number of spatiotemporal processes often not explicitly modeled, such as transcription and translation, which are basic in the modeling of cell signaling pathways. However, for systems with widely varying reaction rate constants or large numbers of molecules, the simulation time steps of both the stochastic simulation algorithm (SSA) and the DSSA can become very small, causing considerable computational overheads. In order to overcome the limit of small step sizes, various τ-leap strategies have been suggested for improving computational performance of the SSA. In this paper, we present a binomial τ-DSSA method that extends the τ-leap idea to the delay setting and avoids drawing insufficient numbers of reactions, a common shortcoming of existing binomial τ-leap methods that becomes evident when dealing with complex chemical interactions. The resulting inaccuracies are most evident in the delayed case, even when considering reaction products as potential reactants within the same time step in which they are produced. Moreover, we extend the framework to account for multicellular systems with different degrees of intercellular communication. We apply these ideas to two important genetic regulatory models, namely, the hes1 gene, implicated as a molecular clock, and a Her1/Her7 model for coupled oscillating cells.
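The general idea of a binomial τ-leap with a delayed product can be sketched on a toy reaction. This is not the binomial τ-DSSA of the paper. It is a simplified illustration under stated assumptions: a single reaction X → Y with rate constant c, reactant consumed when the reaction fires, and product released a fixed number of steps later. The number of firings per leap is drawn from Binomial(x, min(1, c·τ)), so it can never exceed the available molecules. All function names and parameters are hypothetical.

```python
import random
from collections import deque


def binomial_sample(rng, n, p):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def tau_leap_delayed_conversion(x0, c, tau, delay_steps, n_steps, seed=0):
    """Toy binomial tau-leap for X -> Y with a delayed product.

    Returns the (x, y) history and the number of molecules still in transit.
    """
    rng = random.Random(seed)
    x, y = x0, 0
    pending = deque()  # entries are (step at which product appears, count)
    history = [(x, y)]
    for step in range(n_steps):
        # Release products whose delay has elapsed.
        while pending and pending[0][0] <= step:
            y += pending.popleft()[1]
        # Propensity a = c*x and at most N = x firings, so p = a*tau/N = c*tau.
        p = min(1.0, c * tau)
        k = binomial_sample(rng, x, p) if x > 0 else 0
        x -= k  # reactant is consumed when the reaction fires
        if k:
            pending.append((step + delay_steps, k))
        history.append((x, y))
    in_transit = sum(count for _, count in pending)
    return history, in_transit


if __name__ == "__main__":
    hist, in_transit = tau_leap_delayed_conversion(
        x0=1000, c=0.5, tau=0.1, delay_steps=5, n_steps=50)
    print(hist[-1], in_transit)
```

Because the binomial draw is bounded by the current population, the population cannot go negative, which is the usual motivation for binomial rather than Poisson leaps. Molecules are conserved across X, Y, and the in-transit queue.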