34 results for Reinforcement Learning, Deep Neural Networks, Python, Stable Baseline, Gym


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In order to understand the development of non-genetically encoded actions during an animal's lifespan, it is necessary to analyze the dynamics and evolution of learning rules producing behavior. Owing to the intrinsic stochastic and frequency-dependent nature of learning dynamics, these rules are often studied in evolutionary biology via agent-based computer simulations. In this paper, we show that stochastic approximation theory can help to qualitatively understand learning dynamics and formulate analytical models for the evolution of learning rules. We consider a population of individuals repeatedly interacting during their lifespan, and where the stage game faced by the individuals fluctuates according to an environmental stochastic process. Individuals adjust their behavioral actions according to learning rules belonging to the class of experience-weighted attraction learning mechanisms, which includes standard reinforcement and Bayesian learning as special cases. We use stochastic approximation theory in order to derive differential equations governing action play probabilities, which turn out to have qualitative features of mutator-selection equations. We then perform agent-based simulations to find the conditions where the deterministic approximation is closest to the original stochastic learning process for standard 2-action 2-player fluctuating games, where interaction between learning rules and preference reversal may occur. Finally, we analyze a simplified model for the evolution of learning in a producer-scrounger game, which shows that the exploration rate can interact in a non-intuitive way with other features of co-evolving learning rules. Overall, our analyses illustrate the usefulness of applying stochastic approximation theory in the study of animal learning.
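The kind of learning rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a simplified attraction update (it omits the experience-weight normalization of full experience-weighted attraction learning) combined with a logit choice rule. The function names and the parameters `sensitivity`, `decay` and `imagination` are hypothetical, chosen only for illustration.

```python
import math
import random


def choice_probabilities(attractions, sensitivity=1.0):
    """Logit (softmax) rule mapping attractions to action probabilities."""
    peak = max(attractions)  # subtract the maximum for numerical stability
    weights = [math.exp(sensitivity * (a - peak)) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]


def update_attractions(attractions, action, payoffs, decay=0.9, imagination=0.0):
    """One simplified attraction update (illustrative assumption).

    payoffs[i] is the payoff action i would have earned this round.
    imagination=0 reinforces only the chosen action; imagination=1 also
    credits unchosen actions with their forgone payoffs.
    """
    updated = []
    for i, attraction in enumerate(attractions):
        weight = 1.0 if i == action else imagination
        updated.append(decay * attraction + weight * payoffs[i])
    return updated


def simulate(payoff_matrix, rounds=200, seed=0):
    """Two learners repeatedly play a symmetric 2-action game.

    payoff_matrix[own][other] is the payoff for playing `own`
    against an opponent who plays `other`.
    """
    rng = random.Random(seed)
    attractions = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(rounds):
        probs = [choice_probabilities(a) for a in attractions]
        actions = [rng.choices([0, 1], weights=p)[0] for p in probs]
        for player in (0, 1):
            other = actions[1 - player]
            payoffs = [payoff_matrix[own][other] for own in (0, 1)]
            attractions[player] = update_attractions(
                attractions[player], actions[player], payoffs
            )
    return [choice_probabilities(a) for a in attractions]
```

Calling `simulate([[1, 1], [0, 0]])` runs the stochastic process for a game in which action 0 always pays more. The deterministic approximation discussed in the abstract would replace these random draws with differential equations for the play probabilities.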

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spatial data analysis, mapping and visualization are of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations; methods of geostatistics, namely the family of kriging estimators (Deutsch and Journel, 1997); machine learning algorithms such as artificial neural networks (ANN) of different architectures; hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated by noise, often of unknown nature. This is one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as a realization of some spatial random process. To obtain an estimate with kriging, one has to model the spatial structure of the data: the spatial correlation function or (semi-)variogram. This task can be complicated if the number of measurements is insufficient, and the variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from a number of drawbacks. ANN of a special type, multilayer perceptrons, are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignan, 2004). Therefore, the development and adaptation of a method that is nonlinear and robust to noise in measurements, can deal with small empirical datasets and has a solid mathematical background is of great importance. The present paper deals with such a model, based on Statistical Learning Theory (SLT): Support Vector Regression.
SLT is a general mathematical framework devoted to the problem of estimating dependencies from empirical data (Hastie et al., 2004; Vapnik, 1998). SLT models for classification, Support Vector Machines (SVM), have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al., 2002). The properties of SVM for regression, Support Vector Regression (SVR), are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authors for mapping of medium porosity (Kanevski et al., 1999) and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of the SVR model for spatial data analysis and mapping. A detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996), and basic equations for the nonlinear modeling are given in Section 2. Section 3 discusses the application of SVR for spatial data mapping on a real case study: soil pollution by the Cs137 radionuclide. Section 4 discusses the properties of the model applied to noisy data or data with outliers.
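A minimal sketch of two standard SVR ingredients may help show why the method is expected to tolerate noise and outliers: the epsilon-insensitive loss, which ignores residuals smaller than epsilon and grows only linearly beyond it, and the kernel expansion used for nonlinear prediction. This is not the paper's implementation. The Gaussian kernel, the function names, and the coefficients passed to `svr_predict` are assumptions for illustration; in practice the coefficients come from solving the SVR optimization problem.

```python
import math


def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


def squared_loss(residual):
    """Quadratic loss, shown for comparison with the tube loss."""
    return residual ** 2


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two coordinate tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def svr_predict(x, support_vectors, coefficients, bias, bandwidth=1.0):
    """Kernel expansion f(x) = sum_i beta_i * k(x_i, x) + b.

    The coefficients beta_i and the bias b are assumed to be given.
    """
    return bias + sum(
        beta * rbf_kernel(sv, x, bandwidth)
        for sv, beta in zip(support_vectors, coefficients)
    )
```

For a residual of 10, `squared_loss` returns 100 while `epsilon_insensitive_loss` with `epsilon=0.5` returns 9.5, so a single extreme measurement contributes far less to the fitted surface under the tube loss.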

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Increasing evidence suggests that working memory and perceptual processes are dynamically interrelated due to modulating activity in overlapping brain networks. However, the direct influence of working memory on the spatio-temporal brain dynamics of behaviorally relevant intervening information remains unclear. To investigate this issue, subjects performed a visual proximity grid perception task under three different visual-spatial working memory (VSWM) load conditions. VSWM load was manipulated by asking subjects to memorize the spatial locations of 6 or 3 disks. The grid was always presented between the encoding and recognition of the disk pattern. As a baseline condition, grid stimuli were presented without a VSWM context. VSWM load altered both perceptual performance and neural networks active during intervening grid encoding. Participants performed faster and more accurately on a challenging perceptual task under high VSWM load as compared to the low load and the baseline condition. Visual evoked potential (VEP) analyses identified changes in the configuration of the underlying sources in one particular period occurring 160-190 ms post-stimulus onset. Source analyses further showed an occipito-parietal down-regulation concurrent to the increased involvement of temporal and frontal resources in the high VSWM context. Together, these data suggest that cognitive control mechanisms supporting working memory may selectively enhance concurrent visual processing related to an independent goal. More broadly, our findings are in line with theoretical models implicating the engagement of frontal regions in synchronizing and optimizing mnemonic and perceptual resources towards multiple goals.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This report synthesizes the findings of 11 country reports on policy learning in labour market and social policies that were conducted as part of WP5 of the INSPIRES project, which is funded by the 7th Framework Program of the EU Commission. Notably, this report puts forward objectives of policy learning, discusses tools, processes and institutions of policy learning, and presents the impacts of various tools and structures of the policy learning infrastructure on the actual policy learning process. The report defines three objectives of policy learning: evaluation and assessment of policy effectiveness, vision building and planning, and consensus building. In the 11 countries under consideration, the tools and processes of the policy learning infrastructure can be classified into three broad groups: public bodies; expert councils; and parties, interest groups and the private sector. Finally, we develop four recommendations for policy learning. Firstly, learning processes should keep the balance between centralisation and plurality. Secondly, learning processes should be kept stable beyond the usual political business cycles. Thirdly, policy learning tools and infrastructures should be sufficiently independent from political influence or bias. Fourthly, policy learning tools and infrastructures should balance out mere effectiveness, evaluation and vision building.