8 results for Learning behavior
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
Learning by reinforcement is important in shaping animal behavior, and in particular in behavioral decision making. Such decision making is likely to involve the integration of many synaptic events in space and time. However, when a single reinforcement signal is used to modulate synaptic plasticity, as suggested in classical reinforcement learning algorithms, a twofold problem arises. Different synapses will have contributed differently to the behavioral decision, and even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike-time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward, but also by a population feedback signal. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference (TD) based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task, the reward is delayed beyond the last action, with unrelated stimuli and actions appearing in between. The second task involves an action sequence which is itself extended in time, and reward is only delivered at the last action, as is the case in any type of board game. The third task is the inspection game that has been studied in neuroeconomics, where an inspector tries to prevent a worker from shirking.
Applying our algorithm to this game yields a learning behavior which is consistent with behavioral data from humans and monkeys and which itself exhibits properties of a mixed Nash equilibrium. The examples show that our neuronal implementation of reward-based learning copes with delayed and stochastic reward delivery, and also with the learning of mixed strategies in two-opponent games.
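To illustrate how a synaptic eligibility trace can bridge the gap between a synaptic event and a delayed reward, here is a minimal sketch. It is not the plasticity rule of the abstract above: the population feedback signal is omitted, the spike-timing-dependent term is replaced by a hypothetical per-step number, and the names `weight_change`, `local_terms`, `tau_e` and `lr` are assumptions made for this example only.

```python
import math

def weight_change(local_terms, reward_time, reward, n_steps,
                  tau_e=20.0, lr=0.1, dt=1.0):
    """Weight change of one synapse when reward arrives at a single time step.

    local_terms: dict mapping time step -> local plasticity term
                 (hypothetical stand-in for a spike-timing-dependent term).
    tau_e:       assumed decay time constant of the eligibility trace.
    """
    trace = 0.0                      # eligibility trace of the synapse
    dw = 0.0                         # accumulated weight change
    decay = math.exp(-dt / tau_e)    # per-step exponential decay factor
    for t in range(n_steps):
        # The trace decays and is incremented by the local term at this step.
        trace = trace * decay + local_terms.get(t, 0.0)
        # The reward gates how much of the trace is written into the weight.
        if t == reward_time:
            dw += lr * reward * trace
    return dw

# A synaptic event at step 0, rewarded either soon or late.
early = weight_change({0: 1.0}, reward_time=10, reward=1.0, n_steps=50)
late = weight_change({0: 1.0}, reward_time=40, reward=1.0, n_steps=50)
```

In this sketch the same synaptic event receives less credit the later the reward arrives, because the trace has decayed further by then.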
Abstract:
Latrepirdine (Dimebon) is a pro-neurogenic, antihistaminic compound that has yielded mixed results in clinical trials of mild to moderate Alzheimer's disease, with a dramatically positive outcome in a Russian clinical trial that was not confirmed in a replication trial in the United States. We sought to determine whether latrepirdine (LAT)-stimulated amyloid precursor protein (APP) catabolism is at least partially attributable to regulation of macroautophagy, a highly conserved protein catabolism pathway that is known to be impaired in brains of patients with Alzheimer's disease (AD). We utilized several mammalian cellular models to determine whether LAT regulates mammalian target of rapamycin (mTOR) and Atg5-dependent autophagy. Male TgCRND8 mice were chronically administered LAT prior to behavior analysis in the cued and contextual fear conditioning paradigm, as well as immunohistological and biochemical analysis of AD-related neuropathology. Treatment of cultured mammalian cells with LAT led to enhanced mTOR- and Atg5-dependent autophagy. Latrepirdine treatment of TgCRND8 transgenic mice was associated with improved learning behavior and with a reduction in accumulation of Aβ42 and α-synuclein. We conclude that LAT possesses pro-autophagic properties in addition to the previously reported pro-neurogenic properties, both of which are potentially relevant to the treatment and/or prevention of neurodegenerative diseases. We suggest that elucidation of the molecular mechanism(s) underlying LAT effects on neurogenesis, autophagy and behavior might warrant the further study of LAT as a potentially viable lead compound that might yield more consistent clinical benefit following the optimization of its pro-neurogenic, pro-autophagic and/or pro-cognitive activities.
Abstract:
Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So, when a single reinforcement signal is used to modulate synaptic plasticity, a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike-time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal-difference-based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action, with unrelated stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time, and reward is only delivered at the last action, as is the case in any type of board game. The third is the inspection game that has been studied in neuroeconomics. It has only a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.
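As a worked illustration of what a mixed Nash equilibrium looks like in an inspection game, the sketch below solves a 2x2 game by the standard indifference conditions. The payoff numbers are hypothetical textbook-style values (wage 4, effort cost 1, output value 6, inspection cost 2) and are not taken from the study; `mixed_equilibrium` is a helper name introduced here.

```python
from fractions import Fraction

def mixed_equilibrium(A, B):
    """Mixed equilibrium of a 2x2 game with no pure-strategy equilibrium.

    A[i][j]: payoff of the row player, B[i][j]: payoff of the column player,
    when the row player picks i and the column player picks j.
    Returns (p, q): probability that the row player picks row 0 and
    probability that the column player picks column 0.
    """
    # q makes the row player indifferent between its two rows.
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p makes the column player indifferent between its two columns.
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    return p, q

# Rows: worker chooses work / shirk. Columns: inspector chooses inspect / not.
# Hypothetical payoffs: a shirking worker keeps the wage only if not inspected.
worker = [[3, 3],
          [0, 4]]
inspector = [[0, 2],
             [-2, -4]]

p_work, q_inspect = mixed_equilibrium(worker, inspector)
# With these assumed payoffs: p_work = 1/2 and q_inspect = 1/4.
```

Neither player can gain by deviating: at these probabilities each side is exactly indifferent between its two actions, which is why only a stochastic (mixed) strategy is stable.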
Abstract:
Disturbances in reward processing have been implicated in bulimia nervosa (BN). Abnormalities in processing reward-related stimuli might be linked to dysfunctions of the catecholaminergic neurotransmitter system, but findings have been inconclusive. A powerful way to investigate the relationship between catecholaminergic function and behavior is to examine behavioral changes in response to experimental catecholamine depletion (CD). The purpose of this study was to uncover putative catecholaminergic dysfunction in remitted subjects with BN who performed a reinforcement-learning task after CD. CD was achieved by oral alpha-methyl-para-tyrosine (AMPT) in 19 unmedicated female subjects with remitted BN (rBN) and 28 demographically matched healthy female controls (HC). Sham depletion used identical capsules containing diphenhydramine. The study was a randomized, double-blind, placebo-controlled, crossover, single-site experimental trial. The main outcome measure was reward learning in a probabilistic reward task, analyzed using signal-detection theory. Secondary outcome measures were self-report assessments, including the Eating Disorder Examination-Questionnaire. Relative to healthy controls, rBN subjects were characterized by blunted reward learning in the AMPT condition but not in the placebo condition. Highlighting the specificity of these findings, groups did not differ in their ability to perceptually distinguish between stimuli. Increased CD-induced anhedonic (but not eating disorder) symptoms were associated with a reduced response bias toward a more frequently rewarded stimulus. In conclusion, under CD, rBN subjects showed reduced reward learning compared with healthy control subjects. These deficits uncover disturbance of the central reward processing systems in rBN related to altered brain catecholamine levels, which might reflect a trait-like deficit increasing vulnerability to BN.
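The abstract does not state which signal-detection formulas were used. As a hedged illustration, the sketch below computes response bias and discriminability with one formulation commonly applied to probabilistic reward tasks, where "rich" denotes the more frequently rewarded stimulus and "lean" the other. The function names and the trial counts are made up for this example.

```python
import math

def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """log b: tendency to choose the more frequently rewarded (rich) response."""
    return 0.5 * math.log10((rich_correct * lean_incorrect) /
                            (rich_incorrect * lean_correct))

def discriminability(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """log d: ability to perceptually tell the two stimuli apart."""
    return 0.5 * math.log10((rich_correct * lean_correct) /
                            (rich_incorrect * lean_incorrect))

# Hypothetical trial counts for one participant in one block.
counts = dict(rich_correct=80, rich_incorrect=20,
              lean_correct=60, lean_incorrect=40)
log_b = response_bias(**counts)      # positive: bias toward the rich stimulus
log_d = discriminability(**counts)   # positive: stimuli are told apart
```

Under this formulation, the two measures separate: a group can show a lower response bias (blunted reward learning) while its discriminability is unchanged, which is the pattern the abstract reports. Counts of zero would need a correction before taking the logarithm, which this sketch does not handle.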
Abstract:
Rationale: To provide a better understanding of cognitive functioning, motor outcome, behavior and quality of life after childhood stroke and to study the relationship between variables expected to influence rehabilitation and outcome (age at stroke, time elapsed since stroke, lateralization, location and size of lesion). Methods: Children who suffered from stroke between birth and their eighteenth year of life underwent an assessment consisting of cognitive tests (WISC-III, WAIS-R, K-ABC, TAP, Rey-Figure, German Version of the CVLT) and questionnaires (Conners' Scales, KIDSCREEN). Results: Twenty-one patients after stroke in childhood (15 males, mean 11;11 years, SD 4;3, range 6;10-21;2) participated in the study. Mean Intelligence Quotients (IQ) were situated within the normal range (mean Full Scale IQ 96.5, range IQ 79-129). However, significantly more patients showed deficits in various cognitive domains than would be expected in a healthy population (Performance IQ p = .000; Digit Span p = .000; Arithmetic p = .007; Divided Attention p = .028; Alertness p = .002). Verbal IQ was significantly better than Performance IQ in 13 of 17 patients, independent of the hemispheric side of lesion. Symptoms of ADHD occurred more often in the patient sample than in a healthy population (learning difficulties/inattention p = .000; impulsivity/hyperactivity p = .006; psychosomatics p = .006). Certain aspects of quality of life were reduced (autonomy p = .003; parents' relation p = .003; social acceptance p = .037). Three patients had a right-sided hemiparesis; mean values of motor functions of the other patients were slightly impaired (sequential finger movements p = .000, hand alternation p = .001, foot tapping p = .043). In patients without hemiparesis, there was no relation between the lateralization of lesion and motor outcome.
Lesions that occurred in mid-childhood (5-10 years) led to better cognitive outcomes than lesions in very early (0-5 years) or late childhood (10-18 years). Other variables such as presence of seizure, elapsed time since stroke and size of lesion had little to no impact on prognosis. Conclusion: Moderate cognitive and motor deficits, behavioral problems, and impairment in some aspects of quality of life frequently remain after stroke in childhood. Visuospatial functions are more often reduced than verbal functions, independent of the hemispheric side of lesion. This indicates a functional superiority of verbal skills compared to visuospatial skills in the process of recovery after brain injury. In contrast to the cognitive outcome following stroke in adults, cognitive sequelae after childhood stroke indicate neither the lateralization nor the location of the lesion focus. Age at stroke seems to be the only determining factor influencing cognitive outcome.
Abstract:
Given the complex structure of the brain, how can synaptic plasticity explain the learning and forgetting of associations when these are continuously changing? We address this question by studying different reinforcement learning rules in a multilayer network in order to reproduce monkey behavior in a visuomotor association task. Our model can only reproduce the learning performance of the monkey if the synaptic modifications depend on the pre- and postsynaptic activity, and if the intrinsic level of stochasticity is low. This favored learning rule is based on reward-modulated Hebbian synaptic plasticity and has the interesting feature that the learning performance does not substantially degrade when layers are added to the network, even for a complex problem.
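The idea of a reward-modulated Hebbian rule, in which a weight changes only when presynaptic activity, postsynaptic activity and a reward signal coincide, can be sketched as follows. This is a generic textbook-style form and not the specific rule of the study; `hebbian_update`, `baseline` and `lr` are names assumed for this example.

```python
def hebbian_update(weights, pre, post, reward, baseline=0.0, lr=0.1):
    """Return new weights after one reward-modulated Hebbian step.

    weights[i][j]: synapse from presynaptic unit i to postsynaptic unit j.
    pre, post:     activity of the pre- and postsynaptic units.
    reward:        scalar reward; (reward - baseline) sets the sign of change.
    """
    return [
        [w + lr * (reward - baseline) * pre[i] * post[j]
         for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]

# Two presynaptic and two postsynaptic units; only pre 0 and post 1 are active.
w0 = [[0.0, 0.0],
      [0.0, 0.0]]
rewarded = hebbian_update(w0, pre=[1, 0], post=[0, 1], reward=1.0)
punished = hebbian_update(w0, pre=[1, 0], post=[0, 1], reward=-1.0)
```

Only the synapse between the co-active pair changes, and the reward decides whether it is strengthened or weakened, so the same update can both learn and unlearn an association.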
Abstract:
Background: A relationship between bulimia nervosa (BN) and reward-related behavior is supported by several lines of evidence. Dopaminergic dysfunctions in the processing of reward-related stimuli have been shown to be modulated by the neurotrophin brain-derived neurotrophic factor (BDNF) and the hormone leptin. Methods: Using a randomized, double-blind, placebo-controlled, crossover design, a reward learning task was applied to study the behavior of 20 female subjects with remitted BN (rBN) and 27 female healthy controls under placebo and catecholamine depletion with alpha-methyl-para-tyrosine (AMPT). The plasma levels of BDNF and leptin were measured twice during the placebo and the AMPT condition, immediately before and 1 h after a standardized breakfast. Results: AMPT-induced differences in plasma BDNF levels were positively correlated with the AMPT-induced differences in reward learning in the whole sample (p = 0.05). Across conditions, plasma BDNF levels were higher in rBN subjects compared to controls (diagnosis effect; p = 0.001). Plasma BDNF and leptin levels were higher in the morning before a standardized breakfast than after it, across groups and conditions (time effect; p < 0.0001). The plasma leptin levels were higher under catecholamine depletion compared to placebo in the whole sample (treatment effect; p = 0.0004). Conclusions: This study reports preliminary findings that suggest a catecholamine-dependent association of plasma BDNF and reward learning in subjects with rBN and controls. A role of leptin in reward learning is not supported by this study. However, leptin levels were sensitive to a depletion of catecholamine stores in both rBN subjects and controls.