2 results for stochastic stability

at Massachusetts Institute of Technology



Abstract:

Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
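As a rough illustration of the stochastic-approximation view described in the abstract above, the sketch below runs tabular Q-learning on a tiny MDP and compares the result with the Q-values obtained by dynamic programming (value iteration). It is not the paper's proof or its experimental setup. The two-state MDP, the discount factor, the step-size rule alpha_n = n^(-0.7) (chosen because its sum diverges while the sum of its squares converges) and all function names are assumptions made up for this example.

```python
import random

# Hypothetical 2-state, 2-action MDP, invented for illustration only.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(0.8, 0, 0.0), (0.2, 1, 1.0)],
        1: [(0.3, 0, 0.0), (0.7, 1, 0.5)]},
    1: {0: [(0.6, 0, 1.0), (0.4, 1, 0.0)],
        1: [(0.1, 0, 0.0), (0.9, 1, 0.2)]},
}
GAMMA = 0.5  # assumed discount factor


def value_iteration(sweeps=200):
    """Dynamic-programming reference: iterate the Bellman optimality operator."""
    q = {(s, a): 0.0 for s in P for a in P[s]}
    for _ in range(sweeps):
        q = {(s, a): sum(p * (r + GAMMA * max(q[(s2, b)] for b in P[s2]))
                         for p, s2, r in P[s][a])
             for s in P for a in P[s]}
    return q


def sample(s, a, rng):
    """Draw (next_state, reward) from the transition model P[s][a]."""
    u, acc = rng.random(), 0.0
    for p, s2, r in P[s][a]:
        acc += p
        if u < acc:
            return s2, r
    return s2, r  # guard against floating-point round-off


def q_learning(steps=200_000, seed=0):
    """Tabular Q-learning written as a noisy fixed-point (stochastic approximation) iteration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in P for a in P[s]}
    visits = {k: 0 for k in q}
    s = 0
    for _ in range(steps):
        a = rng.randrange(2)  # uniform exploration so every pair keeps being visited
        s2, r = sample(s, a, rng)
        visits[(s, a)] += 1
        # Assumed step size: sum(alpha) diverges, sum(alpha^2) converges.
        alpha = visits[(s, a)] ** -0.7
        # Noisy sample of the Bellman optimality backup for (s, a).
        target = r + GAMMA * max(q[(s2, b)] for b in P[s2])
        q[(s, a)] += alpha * (target - q[(s, a)])
        s = s2
    return q


q_star = value_iteration()
q_hat = q_learning()
max_err = max(abs(q_hat[k] - q_star[k]) for k in q_star)
print(f"max |Q_learned - Q*| = {max_err:.4f}")
```

Each Q-learning update moves one table entry toward a noisy sample of the same backup that value iteration computes exactly, which is the sense in which the learning rule approximates dynamic programming. The decaying step size averages out the sampling noise over repeated visits.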


Abstract:

In biologics production, the productivity and stability of the transfected gene of interest are two important attributes that determine whether a production process is viable. Improving these two traits requires a better understanding of the factors that affect them. These include the integration site of the gene, gene copy number, cell phenotypic variation, and cell environment. Because these factors play different parts in the development process, they lead to variable productivity and stability of the transfected gene between clones, the well-known phenomenon of “clonal variation”. A study of this phenomenon, and of how the various factors contribute to it, will thus shed light on strategies to improve productivity and stability in the production cell line. Of the four factors, the site of gene integration appears to be one of the most important. Hence, it is proposed to study how different integration sites affect the productivity and stability of transfected genes in the development process. For the study to be more industrially relevant, it is proposed that the Chinese Hamster Ovary dhfr-deficient cell line, CHO-DG44, be used as the model system.