977 results for Learning dynamics
Abstract:
In order to understand the development of non-genetically encoded actions during an animal's lifespan, it is necessary to analyze the dynamics and evolution of learning rules producing behavior. Owing to the intrinsic stochastic and frequency-dependent nature of learning dynamics, these rules are often studied in evolutionary biology via agent-based computer simulations. In this paper, we show that stochastic approximation theory can help to qualitatively understand learning dynamics and formulate analytical models for the evolution of learning rules. We consider a population of individuals repeatedly interacting during their lifespan, and where the stage game faced by the individuals fluctuates according to an environmental stochastic process. Individuals adjust their behavioral actions according to learning rules belonging to the class of experience-weighted attraction learning mechanisms, which includes standard reinforcement and Bayesian learning as special cases. We use stochastic approximation theory in order to derive differential equations governing action play probabilities, which turn out to have qualitative features of mutator-selection equations. We then perform agent-based simulations to find the conditions where the deterministic approximation is closest to the original stochastic learning process for standard 2-action 2-player fluctuating games, where interaction between learning rules and preference reversal may occur. Finally, we analyze a simplified model for the evolution of learning in a producer-scrounger game, which shows that the exploration rate can interact in a non-intuitive way with other features of co-evolving learning rules. Overall, our analyses illustrate the usefulness of applying stochastic approximation theory in the study of animal learning.
Abstract:
The learning properties of a universal approximator, a normalized committee machine with adjustable biases, are studied for on-line back-propagation learning. Within a statistical mechanics framework, numerical studies show that this model has features which do not exist in previously studied two-layer network models without adjustable biases, e.g., attractive suboptimal symmetric phases even for realizable cases and noiseless data.
Abstract:
In a local production system (LPS), the literature identifies interaction, cooperation, and learning, alongside external economies, as complementary ways of enhancing the LPS's competitiveness and gains. In Brazil, most LPSs, which are composed mostly of small enterprises, display incipient relationships and low levels of interaction and cooperation among their actors. The size of the participating enterprises itself accounts for specificities that engender organizational constraints, which, in turn, can have a considerable impact on their relationships and learning dynamics. For that reason, this article presents an analysis of interaction, cooperation, and learning relationships among several types of actors in an LPS in the farming equipment and machinery sector, bearing in mind the specificities of small enterprises. To this end, the fieldwork carried out in this study aimed at: (i) investigating external and internal knowledge sources conducive to learning and (ii) identifying and analyzing motivating and inhibiting factors related to specificities of small enterprises in order to bring the LPS members closer together and increase their cooperation and interaction. Empirical evidence shows that internal aspects of the enterprises, related to management and infrastructure, can have a strong bearing on their joint actions, interaction and learning processes.
Abstract:
Agents have two forecasting models, one consistent with the unique rational expectations equilibrium, another that assumes a time-varying parameter structure. When agents use Bayesian updating to choose between models in a self-referential system, we find that learning dynamics lead to selection of one of the two models. However, there are parameter regions for which the non-rational forecasting model is selected in the long run. A key structural parameter governing outcomes measures the degree of expectations feedback in Muth's model of price determination.
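As a rough illustration of Bayesian updating between two forecasting models, the sketch below revises the probability placed on one model after each observation. The Gaussian forecast errors, the function names, and the sample data are assumptions made for this example and are not taken from the paper. The sketch also omits the self-referential feedback from beliefs to outcomes that the abstract describes.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_model_weight(weight_a, observation, forecast_a, forecast_b, sd=1.0):
    """One step of Bayes' rule for the probability that model A is correct.

    weight_a is the prior probability of model A; model B has 1 - weight_a.
    Each model's likelihood is a Gaussian centred on its own forecast
    (an assumption made for this sketch).
    """
    like_a = gaussian_pdf(observation, forecast_a, sd)
    like_b = gaussian_pdf(observation, forecast_b, sd)
    evidence = weight_a * like_a + (1.0 - weight_a) * like_b
    return weight_a * like_a / evidence


# Hypothetical data: every outcome lies closer to model A's forecast (1.0)
# than to model B's forecast (2.0), so the weight on A rises at each step.
weight = 0.5
for y in [1.1, 0.9, 1.2, 1.0, 0.8]:
    weight = update_model_weight(weight, y, forecast_a=1.0, forecast_b=2.0)
```

With fixed forecasts the posterior simply tracks the accumulated likelihood ratio. In a self-referential system the forecasts themselves would depend on the current weight.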
Abstract:
In this work I study the stability of the dynamics generated by adaptive learning processes in intertemporal economies with lagged variables. I prove that determinacy of the steady state is a necessary condition for the convergence of the learning dynamics, and I show that the converse is not true by characterizing the economies where convergence holds. When cycles exist, I show that there is, in general, no relationship between determinacy and convergence of the learning process to the cycle. I also analyze the expectational stability of these equilibria.
Abstract:
This paper is concerned with the realism of mechanisms that implement social choice functions in the traditional sense. Will agents actually play the equilibrium assumed by the analysis? As an example, we study the convergence and stability properties of Sjöström's (1994) mechanism, on the assumption that boundedly rational players find their way to equilibrium using monotonic learning dynamics and also with fictitious play. This mechanism implements most social choice functions in economic environments using as a solution concept the iterated elimination of weakly dominated strategies (only one round of deletion of weakly dominated strategies is needed). There are, however, many sets of Nash equilibria whose payoffs may be very different from those desired by the social choice function. With monotonic dynamics we show that many equilibria in all the sets of equilibria we describe are the limit points of trajectories that have completely mixed initial conditions. The initial conditions that lead to these equilibria need not be very close to the limiting point. Furthermore, even if the dynamics converge to the "right" set of equilibria, they can still converge to quite a poor outcome in welfare terms. With fictitious play, if the agents have completely mixed prior beliefs, beliefs and play converge to the outcome the planner wants to implement.
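For readers unfamiliar with fictitious play, here is a minimal sketch of the generic procedure: each player best-responds to the empirical frequencies of the opponent's past actions, starting from positive (completely mixed) prior weights. The 2x2 coordination game, the function names, and the tie-breaking rule are assumptions chosen for illustration. This is not an implementation of Sjöström's mechanism.

```python
def best_response(payoffs, opponent_counts):
    """Index of the action maximising expected payoff against the
    opponent's empirical action frequencies (ties go to the lowest index)."""
    total = sum(opponent_counts)
    expected = [
        sum(payoffs[a][b] * opponent_counts[b] / total
            for b in range(len(opponent_counts)))
        for a in range(len(payoffs))
    ]
    return max(range(len(expected)), key=lambda a: expected[a])


def fictitious_play(payoffs_row, payoffs_col, prior_row, prior_col, rounds):
    """Run fictitious play in a two-player game.

    payoffs_row[a][b]: row player's payoff for own action a, opponent action b.
    payoffs_col[b][a]: column player's payoff for own action b, opponent action a.
    prior_row: row player's initial belief weights over the column player's actions.
    prior_col: column player's initial belief weights over the row player's actions.
    Strictly positive priors correspond to completely mixed prior beliefs.
    """
    beliefs_row = list(prior_row)
    beliefs_col = list(prior_col)
    history = []
    for _ in range(rounds):
        a = best_response(payoffs_row, beliefs_row)
        b = best_response(payoffs_col, beliefs_col)
        beliefs_row[b] += 1  # row player observed the column player's action
        beliefs_col[a] += 1  # column player observed the row player's action
        history.append((a, b))
    return history


# Hypothetical coordination game: both players earn 2 if both choose
# action 0, 1 if both choose action 1, and 0 otherwise.
coordination = [[2, 0], [0, 1]]
play = fictitious_play(coordination, coordination, [1, 1], [1, 1], rounds=20)
```

In this toy game, uniform priors make action 0 the best response for both players from the first round, and each observation of action 0 reinforces that choice.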
Abstract:
When individuals learn by trial-and-error, they perform randomly chosen actions and then reinforce those actions that led to a high payoff. However, individuals do not always have to physically perform an action in order to evaluate its consequences. Rather, they may be able to mentally simulate actions and their consequences without actually performing them. Such fictitious learners can select actions with high payoffs without making long chains of trial-and-error learning. Here, we analyze the evolution of an n-dimensional cultural trait (or artifact) by learning, in a payoff landscape with a single optimum. We derive the stochastic learning dynamics of the distance to the optimum in trait space when choice between alternative artifacts follows the standard logit choice rule. We show that for both trial-and-error and fictitious learners, the learning dynamics stabilize at an approximate distance of √n/(2λ_e) away from the optimum, where λ_e is an effective learning performance parameter depending on the learning rule under scrutiny. Individual learners are thus unlikely to reach the optimum when traits are complex (n large), and so face a barrier to further improvement of the artifact. We show, however, that this barrier can be significantly reduced in a large population of learners performing payoff-biased social learning, in which case λ_e becomes proportional to population size. Overall, our results illustrate the effects of errors in learning, levels of cognition, and population size for the evolution of complex cultural traits. © 2013 Elsevier Inc. All rights reserved.
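The logit choice rule mentioned in this abstract can be sketched as follows: the probability of choosing an alternative is proportional to the exponential of its payoff scaled by a sensitivity parameter. The function and parameter names are assumptions for illustration. The second helper evaluates the stabilization distance, reading the abstract's expression as the square root of n divided by 2 times lambda_e.

```python
import math


def logit_choice_probabilities(payoffs, sensitivity):
    """Logit (softmax) choice: P(i) is proportional to exp(sensitivity * payoff_i)."""
    top = max(payoffs)  # subtract the maximum for numerical stability
    weights = [math.exp(sensitivity * (p - top)) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def stabilization_distance(n, lambda_e):
    """Approximate distance from the optimum, read as sqrt(n) / (2 * lambda_e)."""
    return math.sqrt(n) / (2.0 * lambda_e)


# Hypothetical payoffs for three alternative artifacts.
probabilities = logit_choice_probabilities([1.0, 2.0, 3.0], sensitivity=1.0)
```

A sensitivity of zero gives uniform random choice, and larger values concentrate probability on the highest-payoff alternative. Under the stated reading, the distance grows with trait dimension n and shrinks as lambda_e increases.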
Abstract:
Many species are able to learn to associate behaviours with rewards as this gives fitness advantages in changing environments. Social interactions between population members may, however, require more cognitive abilities than simple trial-and-error learning, in particular the capacity to make accurate hypotheses about the material payoff consequences of alternative action combinations. It is unclear in this context whether natural selection necessarily favours individuals to use information about payoffs associated with nontried actions (hypothetical payoffs), as opposed to simple reinforcement of realized payoff. Here, we develop an evolutionary model in which individuals are genetically determined to use either trial-and-error learning or learning based on hypothetical reinforcements, and ask what is the evolutionarily stable learning rule under pairwise symmetric two-action stochastic repeated games played over the individual's lifetime. We analyse through stochastic approximation theory and simulations the learning dynamics on the behavioural timescale, and derive conditions where trial-and-error learning outcompetes hypothetical reinforcement learning on the evolutionary timescale. This occurs in particular under repeated cooperative interactions with the same partner. By contrast, we find that hypothetical reinforcement learners tend to be favoured under random interactions, but stable polymorphisms can also obtain where trial-and-error learners are maintained at a low frequency. We conclude that specific game structures can select for trial-and-error learning even in the absence of costs of cognition, which illustrates that cost-free increased cognition can be counterselected under social interactions.
Abstract:
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Abstract:
The influence of biases on the learning dynamics of a two-layer neural network, a normalized soft-committee machine, is studied for on-line gradient descent learning. Within a statistical mechanics framework, numerical studies show that the inclusion of adjustable biases dramatically alters the learning dynamics found previously. The symmetric phase which has often been predominant in the original model all but disappears for a non-degenerate bias task. The extended model furthermore exhibits a much richer dynamical behavior, e.g. attractive suboptimal symmetric phases even for realizable cases and noiseless data.
Abstract:
The dynamics of on-line learning is investigated for structurally unrealizable tasks in the context of two-layer neural networks with an arbitrary number of hidden neurons. Within a statistical mechanics framework, a closed set of differential equations describing the learning dynamics can be derived for the general case of unrealizable isotropic tasks. In the asymptotic regime one can solve the dynamics analytically in the limit of a large number of hidden neurons, providing an analytical expression for the residual generalization error, the optimal and critical asymptotic training parameters, and the corresponding prefactor of the generalization error decay.
Abstract:
We study the dynamics of on-line learning in multilayer neural networks where training examples are sampled with repetition and where the number of examples scales with the number of network weights. The analysis is carried out using the dynamical replica method aimed at obtaining a closed set of coupled equations for a set of macroscopic variables from which both training and generalization errors can be calculated. We focus on scenarios whereby training examples are corrupted by additive Gaussian output noise and regularizers are introduced to improve the network performance. The dependence of the dynamics on the noise level, with and without regularizers, is examined, as well as that of the asymptotic values obtained for both training and generalization errors. We also demonstrate the ability of the method to approximate the learning dynamics in structurally unrealizable scenarios. The theoretical results show good agreement with those obtained by computer simulations.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
The control of movement is predicated upon a system of constraints of musculoskeletal and neural origin. The focus of the present study was upon the manner in which such constraints are adapted or superseded during the acquisition of motor skill. Individuals participated in five experimental sessions, in which they attempted to produce abduction-adduction movements of the index finger in time with an auditory metronome. During each trial, the metronome frequency was increased in eight steps from an individually determined base frequency. Electromyographic (EMG) activity was recorded from first dorsal interosseous (FDI), first volar interosseous (FVI), flexor digitorum superficialis (FDS), and extensor digitorum communis (EDC) muscles. The movements produced on the final day of acquisition more accurately matched the required profile, and exhibited greater spatial and temporal stability, than those generated during initial performance. In the early stages of skill acquisition, an alternating pattern of activation in FDI and FVI was maintained, even at the highest frequencies. In contrast, as the frequency of movement was increased, activity in FDS and EDC was either tonic or intermittent. As learning proceeded, alterations in recruitment patterns were expressed primarily in the extrinsic muscles (EDC and FDS). These changes took the form of increases in the postural role of these muscles, shifts to phasic patterns of activation, or selective disengagement of these muscles. These findings suggest that there is considerable flexibility in the composition of muscle synergies, which is exploited by individuals during the acquisition of coordination.
Abstract:
This paper considers the Ricardian Equivalence proposition when expectations are not rational and are instead formed using adaptive learning rules. We show that Ricardian Equivalence continues to hold provided suitable additional conditions on learning dynamics are satisfied. However, new cases of failure can also emerge under learning. In particular, for Ricardian Equivalence to obtain, agents’ expectations must not depend on government’s financial variables under deficit financing.