Biblioteca Digital

16 resultados para Learning of improvisation

Spike-based decision learning of nash equilibria in two-player games

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Humans and animals face decision tasks in an uncertain multi-agent environment where an agent's strategy may change in time due to the co-adaptation of others strategies. The neuronal substrate and the computational algorithms underlying such adaptive decision making, however, is largely unknown. We propose a population coding model of spiking neurons with a policy gradient procedure that successfully acquires optimal strategies for classical game-theoretical tasks. The suggested population reinforcement learning reproduces data from human behavioral experiments for the blackjack and the inspector game. It performs optimally according to a pure (deterministic) and mixed (stochastic) Nash equilibrium, respectively. In contrast, temporal-difference(TD)-learning, covariance-learning, and basic reinforcement learning fail to perform optimally for the stochastic strategy. Spike-based population reinforcement learning, shown to follow the stochastic reward gradient, is therefore a viable candidate to explain automated decision learning of a Nash equilibrium in two-player games.

Graph matching - Exact and error-tolerant methods and the automatic learning of edit costs

Relevância:

100.00% 100.00%

Publicador:

Automatic learning of cost functions for graph edit distance

Relevância:

100.00% 100.00%

Publicador:

Interference during the implicit learning of two different motor sequences

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It has been demonstrated that learning a second motor task after having learned a first task may interfere with the long-term consolidation of the first task. However, little is known about immediate changes in the representation of the motor memory in the early acquisition phase within the first minutes of the learning process. Therefore, we investigated such early interference effects with an implicit serial reaction time task in 55 healthy subjects. Each subject performed either a sequence learning task involving two different sequences, or a random control task. The results showed that learning the first sequence led to only a slight, short-lived interference effect in the early acquisition phase of the second sequence. Overall, learning of neither sequence was impaired. Furthermore, the two processes, sequence-unrelated task learning (i.e. general motor training) and the sequence learning itself did not appear to interfere with each other. In conclusion, although the long-term consolidation of a motor memory has been shown to be sensitive to other interfering memories, the present study suggests that the brain is initially able to acquire more than one new motor sequence within a short space of time without significant interference.

Perceptual learning of motion discrimination by mental imagery.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Perceptual learning can occur when stimuli are only imagined, i.e., without proper stimulus presentation. For example, perceptual learning improved bisection discrimination when only the two outer lines of the bisection stimulus were presented and the central line had to be imagined. Performance improved also with other static stimuli. In non-learning imagery experiments, imagining static stimuli is different from imagining motion stimuli. We hypothesized that those differences also affect imagery perceptual learning. Here, we show that imagery training also improves motion direction discrimination. Learning occurs when no stimulus at all is presented during training, whereas no learning occurs when only noise is presented. The interference between noise and mental imagery possibly hinders learning. For static bisection stimuli, the pattern is just the opposite. Learning occurs when presented with the two outer lines of the bisection stimulus, i.e., with only a part of the visual stimulus, while no learning occurs when no stimulus at all is presented.

The early learning of failure cures in elementary school children

Relevância:

100.00% 100.00%

Publicador:

Governing individual learning in the transition phase of software maintenance offshoring: a dynamic perspective

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Prior studies suggest that clients need to actively govern knowledge transfer to vendor staff in offshore outsourcing. In this paper, we analyze longitudinal data from four software maintenance offshore out-sourcing projects to explore why governance may be needed for knowledge transfer and how governance and the individual learning of vendor engineers inter-act over time. Our results suggest that self-control is central to learning, but may be hampered by low levels of trust and expertise at the outset of projects. For these foundations to develop, clients initially need to exert high amounts of formal and clan controls to enforce learning activities against barriers to knowledge sharing. Once learning activities occur, trust and expertise increase and control portfolios may show greater emphases on self-control.

Spatio-temporal credit assignment in population learning

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Learning by reinforcement is important in shaping animal behavior, and in particular in behavioral decision making. Such decision making is likely to involve the integration of many synaptic events in space and time. However, using a single reinforcement signal to modulate synaptic plasticity, as suggested in classical reinforcement learning algorithms, a twofold problem arises. Different synapses will have contributed differently to the behavioral decision, and even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike-time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward, but also by a population feedback signal. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference (TD) based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task, the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second task involves an action sequence which is itself extended in time and reward is only delivered at the last action, as it is the case in any type of board-game. The third task is the inspection game that has been studied in neuroeconomics, where an inspector tries to prevent a worker from shirking. Applying our algorithm to this game yields a learning behavior which is consistent with behavioral data from humans and monkeys, revealing themselves properties of a mixed Nash equilibrium. The examples show that our neuronal implementation of reward based learning copes with delayed and stochastic reward delivery, and also with the learning of mixed strategies in two-opponent games.

Spatio-temporal credit assignment in population learning

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So in using a single reinforcement signal to modulate synaptic plasticity a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time and reward is only delivered at the last action, as is the case in any type of board-game. The third is the inspection game that has been studied in neuroeconomics. It only has a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.

Reinforcement learning in dendritic structures

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The discovery of binary dendritic events such as local NMDA spikes in dendritic subbranches led to the suggestion that dendritic trees could be computationally equivalent to a 2-layer network of point neurons, with a single output unit represented by the soma, and input units represented by the dendritic branches. Although this interpretation endows a neuron with a high computational power, it is functionally not clear why nature would have preferred the dendritic solution with a single but complex neuron, as opposed to the network solution with many but simple units. We show that the dendritic solution has a distinguished advantage over the network solution when considering different learning tasks. Its key property is that the dendritic branches receive an immediate feedback from the somatic output spike, while in the corresponding network architecture the feedback would require additional backpropagating connections to the input units. Assuming a reinforcement learning scenario we formally derive a learning rule for the synaptic contacts on the individual dendritic trees which depends on the presynaptic activity, the local NMDA spikes, the somatic action potential, and a delayed reinforcement signal. We test the model for two scenarios: the learning of binary classifications and of precise spike timings. We show that the immediate feedback represented by the backpropagating action potential supplies the individual dendritic branches with enough information to efficiently adapt their synapses and to speed up the learning process.

Reinforcement learning in dendritic structures

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The discovery of binary dendritic events such as local NMDA spikes in dendritic subbranches led to the suggestion that dendritic trees could be computationally equivalent to a 2-layer network of point neurons, with a single output unit represented by the soma, and input units represented by the dendritic branches. Although this interpretation endows a neuron with a high computational power, it is functionally not clear why nature would have preferred the dendritic solution with a single but complex neuron, as opposed to the network solution with many but simple units. We show that the dendritic solution has a distinguished advantage over the network solution when considering different learning tasks. Its key property is that the dendritic branches receive an immediate feedback from the somatic output spike, while in the corresponding network architecture the feedback would require additional backpropagating connections to the input units. Assuming a reinforcement learning scenario we formally derive a learning rule for the synaptic contacts on the individual dendritic trees which depends on the presynaptic activity, the local NMDA spikes, the somatic action potential, and a delayed reinforcement signal. We test the model for two scenarios: the learning of binary classifications and of precise spike timings. We show that the immediate feedback represented by the backpropagating action potential supplies the individual dendritic branches with enough information to efficiently adapt their synapses and to speed up the learning process.

Retraining of interjoint arm coordination after stroke using robot-assisted time-independent functional training

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We have developed a haptic-based approach for retraining of interjoint coordination following stroke called time-independent functional training (TIFT) and implemented this mode in the ARMin III robotic exoskeleton. The ARMin III robot was developed by Drs. Robert Riener and Tobias Nef at the Swiss Federal Institute of Technology Zurich (Eidgenossische Technische Hochschule Zurich, or ETH Zurich), in Zurich, Switzerland. In the TIFT mode, the robot maintains arm movements within the proper kinematic trajectory via haptic walls at each joint. These arm movements focus training of interjoint coordination with highly intuitive real-time feedback of performance; arm movements advance within the trajectory only if their movement coordination is correct. In initial testing, 37 nondisabled subjects received a single session of learning of a complex pattern. Subjects were randomized to TIFT or visual demonstration or moved along with the robot as it moved though the pattern (time-dependent [TD] training). We examined visual demonstration to separate the effects of action observation on motor learning from the effects of the two haptic guidance methods. During these training trials, TIFT subjects reduced error and interaction forces between the robot and arm, while TD subject performance did not change. All groups showed significant learning of the trajectory during unassisted recall trials, but we observed no difference in learning between groups, possibly because this learning task is dominated by vision. Further testing in stroke populations is warranted.

Different Learning Curves for Axillary Brachial Plexus Block: Ultrasound Guidance versus Nerve Stimulation

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Little is known about the learning of the skills needed to perform ultrasound- or nerve stimulator-guided peripheral nerve blocks. The aim of this study was to compare the learning curves of residents trained in ultrasound guidance versus residents trained in nerve stimulation for axillary brachial plexus block. Ten residents with no previous experience with using ultrasound received ultrasound training and another ten residents with no previous experience with using nerve stimulation received nerve stimulation training. The novices' learning curves were generated by retrospective data analysis out of our electronic anaesthesia database. Individual success rates were pooled, and the institutional learning curve was calculated using a bootstrapping technique in combination with a Monte Carlo simulation procedure. The skills required to perform successful ultrasound-guided axillary brachial plexus block can be learnt faster and lead to a higher final success rate compared to nerve stimulator-guided axillary brachial plexus block.

Connectionism as metaphor: Towards an integrated, unified conception of self-system and individual difference

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Comments on an article by Kashima et al. (see record 2007-10111-001). In their target article Kashima and colleagues try to show how a connectionist model conceptualization of the self is best suited to capture the self's temporal and socio-culturally contextualized nature. They propose a new model and to support this model, the authors conduct computer simulations of psychological phenomena whose importance for the self has long been clear, even if not formally modeled, such as imitation, and learning of sequence and narrative. As explicated when we advocated connectionist models as a metaphor for self in Mischel and Morf (2003), we fully endorse the utility of such a metaphor, as these models have some of the processing characteristics necessary for capturing key aspects and functions of a dynamic cognitive-affective self-system. As elaborated in that chapter, we see as their principal strength that connectionist models can take account of multiple simultaneous processes without invoking a single central control. All outputs reflect a distributed pattern of activation across a large number of simple processing units, the nature of which depends on (and changes with) the connection weights between the links and the satisfaction of mutual constraints across these links (Rummelhart & McClelland, 1986). This allows a simple account for why certain input features will at times predominate, while others take over on other occasions. (PsycINFO Database Record (c) 2008 APA, all rights reserved)

Probabilistic contingency learning with limbic or prefrontal damage

Relevância:

90.00% 90.00%

Publicador:

Resumo:

AB A fundamental capacity of the human brain is to learn relations (contingencies) between environmental stimuli and the consequences of their occurrence. Some contingencies are probabilistic; that is, they predict an event in some situations but not in all. Animal studies suggest that damage to limbic structures or the prefrontal cortex may disturb probabilistic learning. The authors studied the learning of probabilistic contingencies in amnesic patients with limbic lesions, patients with prefrontal cortex damage, and healthy controls. Across 120 trials, participants learned contingent relations between spatial sequences and a button press. Amnesic patients had learning comparable to that of control subjects but failed to indicate what they had learned. Across the last 60 trials, amnesic patients and control subjects learned to avoid a noncontingent choice better than frontal patients. These results indicate that probabilistic learning does not depend on the brain structures supporting declarative memory.

«
1
2
»