999 results for Reward rate
Abstract:
For a multiarmed bandit problem with exponential discounting, the optimal allocation rule is given by a dynamic allocation index defined for each arm on its state space. The index for an arm is equal to the expected immediate reward from the arm, with an upward adjustment reflecting any uncertainty about the prospects of obtaining rewards from the arm and the possibility of resolving that uncertainty by selecting the arm. The learning component of the index is therefore defined as the difference between the index and the expected immediate reward. For two arms with the same expected immediate reward, the learning component should be larger for the arm whose reward rate is more uncertain. This is shown to be true for arms based on independent samples from a fixed distribution with an unknown parameter in the cases of Bernoulli and normal distributions, and similar results are obtained in other cases.
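The decomposition described above (index = expected immediate reward + learning component) can be illustrated numerically. The following is a minimal sketch, not the method of the paper: it assumes a Bernoulli arm with a Beta(a, b) posterior on its success probability, approximates the index by the standard retirement-reward calibration with a truncated horizon, and uses a hypothetical function name and hypothetical parameters (`beta`, `horizon`, `tol`) chosen only for illustration.

```python
def gittins_index_bernoulli(a, b, beta=0.9, horizon=100, tol=1e-6):
    """Approximate the dynamic allocation index of a Bernoulli arm.

    Assumptions (not from the source): Beta(a, b) posterior, discount
    factor `beta`, and a lookahead truncated after `horizon` pulls.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    mean = a / (a + b)          # expected immediate reward
    lo, hi = mean, 1.0          # the index lies between the mean and 1
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)   # candidate per-period retirement reward
        retire = lam / (1.0 - beta)
        # Truncation: at the horizon, either retire or commit to the arm
        # forever at its current posterior mean.
        n_end = a + b + horizon
        values = [max(retire, (a + s) / n_end / (1.0 - beta))
                  for s in range(horizon + 1)]
        # Backward induction over the number of successes s at each depth.
        for depth in range(horizon - 1, 0, -1):
            n = a + b + depth
            values = [
                max(retire,
                    (a + s) / n
                    + beta * ((a + s) / n * values[s + 1]
                              + (1.0 - (a + s) / n) * values[s]))
                for s in range(depth + 1)
            ]
        # Value of pulling the arm once more from the starting state.
        cont = mean + beta * (mean * values[1] + (1.0 - mean) * values[0])
        if cont > retire:
            lo = lam            # pulling still beats retiring: index is higher
        else:
            hi = lam
    return 0.5 * (lo + hi)


def learning_component(a, b, **kwargs):
    """Index minus expected immediate reward, as defined in the abstract."""
    return gittins_index_bernoulli(a, b, **kwargs) - a / (a + b)
```

Under these assumptions, comparing `learning_component(1, 1)` with `learning_component(10, 10)` contrasts two arms that share the expected immediate reward 0.5 but differ in how uncertain the reward rate is; the first, more uncertain arm should receive the larger learning component.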
Abstract:
Decision-making is a fundamental computational process in many aspects of animal behavior. The model most often encountered in studies of decision-making is the diffusion model. It has long explained a wide variety of behavioral and neurophysiological data in this field. However, another model, the urgency model, explains the same data equally well, and does so more parsimoniously and with a firmer grounding in theory. In this work, we first address the origins and development of the diffusion model and show how it became established as the framework for interpreting most experimental data on decision-making. In doing so, we identify its strengths so that it can then be compared objectively and rigorously with alternative models. We re-examine a number of implicit and explicit assumptions made by this model and highlight some of its shortcomings. This analysis frames our introduction and discussion of the urgency model. Finally, we present an experiment whose methodology makes it possible to dissociate the two models, and whose results illustrate the empirical and theoretical limits of the diffusion model while clearly demonstrating the validity of the urgency model. We conclude by discussing the potential contribution of the urgency model to the study of certain brain pathologies, with an emphasis on new research directions.
Abstract:
Although generalist predators have been reported to forage less efficiently than specialists, there is little information on the extent to which learning can improve the efficiency of mixed-prey foraging. Repeated exposure of silver perch to mixed prey (pelagic Artemia and benthic Chironomus larvae) led to substantial fluctuations in reward rate over relatively long (20-day) timescales. When perch that were familiar with a single prey type were offered two prey types simultaneously, the rate at which they captured both familiar and unfamiliar prey dropped progressively over succeeding trials. This result was not predicted by simple learning paradigms, but could be explained in terms of an interaction between learning and attention. Between-trial patterns in overall intake were complex and differed between the two prey types, but were unaffected by previous prey specialization. However, patterns of prey priority (i.e. the prey type that was preferred at the start of a trial) did vary with previous prey training. All groups of fish converged on the most profitable prey type (chironomids), but this process took 15-20 trials. In contrast, fish offered a single prey type reached asymptotic intake rates within five trials and retained high capture abilities for at least 5 weeks. Learning and memory allow fish to maximize foraging efficiency on patches of a single prey type. However, when foragers are faced with mixed prey populations, cognitive constraints associated with divided attention may impair efficiency, and this impairment can be exacerbated by experience. (c) 2005 The Association for the Study of Animal Behaviour. Published by Elsevier Ltd. All rights reserved.
Abstract:
When searching for food, animals often make decisions about where to go, how long to stay in a foraging area and whether or not to return to the last visited spot. These decisions can be enhanced by cognitive traits and adjusted based on previous experience. In social insects such as ants, foraging efficiency has an impact at both the individual and the colony level. The present study investigated, in the laboratory, the effect of distance from food, capture success, food size and reward rate on decisions about where to forage in Dinoponera quadriceps, a ponerine ant whose workers forage solitarily and make their foraging decisions individually. We also investigated the influence of learning on the performance of workers over successive trips searching for food by measuring the patch residence time in each foraging trip. Four scenarios were created, differing in food reward rate, size of food offered and distance between the colony and the food site. Our work has shown that, as a rule of thumb, workers of D. quadriceps return to the place where a prey item was found on the previous trip, regardless of distance, food size and reward rate. When ants did not capture prey, they were more likely to change path to search for food. However, in one of the scenarios, this decision to switch paths when unsuccessful was less evident, possibly due to the greater variation in the outcomes ants could experience in this scenario and to cognitive constraints on the ability of D. quadriceps to predict variations in food distribution. Our results also indicated a process of learning both the exploration routes and the conditions at the food site. After repeated trips, foragers reduced the patch residence time in areas where they did not capture food and quickly changed foraging area, increasing their foraging efficiency.
Abstract:
Establishing a function for the neuromodulator serotonin in human decision-making has proved remarkably difficult because of its complex role in reward and punishment processing. In a novel choice task where actions led concurrently and independently to the stochastic delivery of both money and pain, we studied the impact of decreased brain serotonin induced by acute dietary tryptophan depletion. Depletion selectively impaired both behavioral and neural representations of reward outcome value, and hence the effective exchange rate by which rewards and punishments were compared. This effect was computationally and anatomically distinct from a separate effect of increasing outcome-independent choice perseveration. Our results provide evidence for a surprising role for serotonin in reward processing, while illustrating its complex and multifarious effects.
Abstract:
Background: Anhedonia, the loss of pleasure in usually enjoyable activities, is a central feature of major depressive disorder (MDD). The aim of the present study was to examine whether young people at a familial risk of depression display signs of anticipatory, motivational or consummatory anhedonia, which would indicate that these deficits may be trait markers for MDD. Methods: The study was completed by 22 participants with a family history of depression (FH+) and 21 controls (HC). Anticipatory anhedonia was assessed by asking participants to rate their anticipated liking of pleasant and unpleasant foods which they imagined tasting when cued with images of the foods. Motivational anhedonia was measured by requiring participants to perform key presses to obtain pleasant chocolate taste rewards or to avoid unpleasant apple tastes. Additionally, physical consummatory anhedonia was examined by instructing participants to rate the pleasantness of the acquired tastes. Moreover, social consummatory anhedonia was investigated by asking participants to make preference-based choices between neutral facial expressions, genuine smiles, and polite smiles. Results: It was found that the FH+ group’s anticipated liking of unpleasant foods was significantly lower than that of the control group. By contrast, no group differences in the pleasantness ratings of the actually experienced tastes or in the amount of performed key presses were observed. However, controls preferred genuine smiles over neutral expressions more often than they preferred polite smiles over neutral expressions, while this pattern was not seen in the FH+ group. Conclusion: These findings suggest that FH+ individuals demonstrate an altered anticipatory response to negative stimuli and show signs of social consummatory anhedonia, which may be trait markers for depression.
Abstract:
We investigate a recently proposed model for decision learning in a population of spiking neurons where synaptic plasticity is modulated by a population signal in addition to reward feedback. For the basic model, binary population decision making based on spike/no-spike coding, a detailed computational analysis is given of how learning performance depends on population size and task complexity. Next, we extend the basic model to n-ary decision making and show that it can also be used in conjunction with other population codes such as rate or even latency coding.
Abstract:
The brain vesicular monoamine transporter (VMAT2) pumps monoamine neurotransmitters and Parkinsonism-inducing dopamine neurotoxins such as 1-methyl-4-phenylpyridinium (MPP+) from neuronal cytoplasm into synaptic vesicles, from which amphetamines cause their release. Amphetamines and MPP+ each also act at nonvesicular sites, leaving uncertainty about the contributions of vesicular actions to their in vivo effects. To assess vesicular contributions to amphetamine-induced locomotion, amphetamine-induced reward, and sequestration of and resistance to dopaminergic neurotoxins, we have constructed transgenic VMAT2 knockout mice. Heterozygous VMAT2 knockouts are viable into adult life and display VMAT2 levels one-half of wild-type values, accompanied by smaller changes in monoaminergic markers, heart rate, and blood pressure. Weight gain, fertility, habituation, passive avoidance, and locomotor activities are similar to those of wild-type littermates. In these heterozygotes, amphetamine produces enhanced locomotion but diminished behavioral reward, as measured by conditioned place preference. Administration of the MPP+ precursor N-methyl-4-phenyl-1,2,3,6-tetrahydropyridine to heterozygotes produces more than twice the dopamine cell losses found in wild-type mice. These mice provide novel information about the contributions of synaptic vesicular actions of monoaminergic drugs and neurotoxins and suggest that intact synaptic vesicle function may contribute more to amphetamine-conditioned reward than to amphetamine-induced locomotion.
Abstract:
Raman spectroscopy of formamide-intercalated kaolinites treated using controlled-rate thermal analysis technology (CRTA), allowing the separation of adsorbed formamide from intercalated formamide in formamide-intercalated kaolinites, is reported. The Raman spectra of the CRTA-treated formamide-intercalated kaolinites are significantly different from those of the intercalated kaolinites, which display a combination of both intercalated and adsorbed formamide. An intense band is observed at 3629 cm⁻¹, attributed to the inner surface hydroxyls hydrogen bonded to the formamide. Broad bands are observed at 3600 and 3639 cm⁻¹, assigned to the inner surface hydroxyls, which are hydrogen bonded to the adsorbed water molecules. The hydroxyl-stretching band of the inner hydroxyl is observed at 3621 cm⁻¹ in the Raman spectra of the CRTA-treated formamide-intercalated kaolinites. The results of thermal analysis show that the amount of intercalated formamide between the kaolinite layers is independent of the presence of water. Significant differences are observed in the CO stretching region between the adsorbed and intercalated formamide.
Abstract:
The thermal behaviour of halloysite fully expanded with hydrazine-hydrate has been investigated in nitrogen atmosphere under dynamic heating and at a constant, pre-set decomposition rate of 0.15 mg min⁻¹. Under controlled-rate thermal analysis (CRTA) conditions it was possible to resolve the closely overlapping decomposition stages and to distinguish between adsorbed and bonded reagent. Three types of bonded reagent could be identified. The loosely bonded reagent amounting to 0.20 mol hydrazine-hydrate per mol inner surface hydroxyl is connected to the internal and external surfaces of the expanded mineral and is present as a space filler between the sheets of the delaminated mineral. The strongly bonded (intercalated) hydrazine-hydrate is connected to the kaolinite inner surface OH groups by the formation of hydrogen bonds. Based on the thermoanalytical results two different types of bonded reagent could be distinguished in the complex. Type 1 reagent (approx. 0.06 mol hydrazine-hydrate/mol inner surface OH) is liberated between 77 and 103°C. Type 2 reagent is lost between 103 and 227°C, corresponding to a quantity of 0.36 mol hydrazine/mol inner surface OH. When heating the complex to 77°C under CRTA conditions a new reflection appears in the XRD pattern with a d-value of 9.6 Å, in addition to the 10.2 Å reflection. This new reflection disappears in contact with moist air and the complex re-expands to the original d-value of 10.2 Å within a few hours. The appearance of the 9.6 Å reflection is interpreted as the expansion of kaolinite with hydrazine alone, while the 10.2 Å one is due to expansion with hydrazine-hydrate. FTIR (DRIFT) spectroscopic results showed that the treated mineral after intercalation/deintercalation and heat treatment to 300°C is slightly more ordered than the original (untreated) clay.