16 resultados para reward-and-retention
em Cambridge University Engineering Department Publications Database
Resumo:
Motor task variation has been shown to be a key ingredient in skill transfer, retention, and structural learning. However, many studies only compare training of randomly varying tasks to either blocked or null training, and it is not clear how experiencing different nonrandom temporal orderings of tasks might affect the learning process. Here we study learning in human subjects who experience the same set of visuomotor rotations, evenly spaced between -60° and +60°, either in a random order or in an order in which the rotation angle changed gradually. We compared subsequent learning of three test blocks of +30°→-30°→+30° rotations. The groups that underwent either random or gradual training showed significant (P < 0.01) facilitation of learning in the test blocks compared with a control group who had not experienced any visuomotor rotations before. We also found that movement initiation times in the random group during the test blocks were significantly (P < 0.05) lower than for the gradual or the control group. When we fit a state-space model with fast and slow learning processes to our data, we found that the differences in performance in the test block were consistent with the gradual or random task variation changing the learning and retention rates of only the fast learning process. Such adaptation of learning rates may be a key feature of ongoing meta-learning processes. Our results therefore suggest that both gradual and random task variation can induce meta-learning and that random learning has an advantage in terms of shorter initiation times, suggesting less reliance on cognitive processes.
Resumo:
Establishing a function for the neuromodulator serotonin in human decision-making has proved remarkably difficult because if its complex role in reward and punishment processing. In a novel choice task where actions led concurrently and independently to the stochastic delivery of both money and pain, we studied the impact of decreased brain serotonin induced by acute dietary tryptophan depletion. Depletion selectively impaired both behavioral and neural representations of reward outcome value, and hence the effective exchange rate by which rewards and punishments were compared. This effect was computationally and anatomically distinct from a separate effect on increasing outcome-independent choice perseveration. Our results provide evidence for a surprising role for serotonin in reward processing, while illustrating its complex and multifarious effects.
Resumo:
Reward processing is linked to specific neuromodulatory systems with a dopaminergic contribution to reward learning and motivational drive being well established. Neuromodulatory influences on hedonic responses to actual receipt of reward, or punishment, referred to as experienced utility are less well characterized, although a link to the endogenous opioid system is suggested. Here, in a combined functional magnetic resonance imaging-psychopharmacological investigation, we used naloxone to block central opioid function while subjects performed a gambling task associated with rewards and losses of different magnitudes, in which the mean expected value was always zero. A graded influence of naloxone on reward outcome was evident in an attenuation of pleasure ratings for larger reward outcomes, an effect mirrored in attenuation of brain activity to increasing reward magnitude in rostral anterior cingulate cortex. A more striking effect was seen for losses such that under naloxone all levels of negative outcome were rated as more unpleasant. This hedonic effect was associated with enhanced activity in anterior insula and caudal anterior cingulate cortex, areas implicated in aversive processing. Our data indicate that a central opioid system contributes to both reward and loss processing in humans and directly modulates the hedonic experience of outcomes.
Resumo:
Studies on human monetary prediction and decision making emphasize the role of the striatum in encoding prediction errors for financial reward. However, less is known about how the brain encodes financial loss. Using Pavlovian conditioning of visual cues to outcomes that simultaneously incorporate the chance of financial reward and loss, we show that striatal activation reflects positively signed prediction errors for both. Furthermore, we show functional segregation within the striatum, with more anterior regions showing relative selectivity for rewards and more posterior regions for losses. These findings mirror the anteroposterior valence-specific gradient reported in rodents and endorse the role of the striatum in aversive motivational learning about financial losses, illustrating functional and anatomical consistencies with primary aversive outcomes such as pain.
Resumo:
Recent experiments have shown that spike-timing-dependent plasticity is influenced by neuromodulation. We derive theoretical conditions for successful learning of reward-related behavior for a large class of learning rules where Hebbian synaptic plasticity is conditioned on a global modulatory factor signaling reward. We show that all learning rules in this class can be separated into a term that captures the covariance of neuronal firing and reward and a second term that presents the influence of unsupervised learning. The unsupervised term, which is, in general, detrimental for reward-based learning, can be suppressed if the neuromodulatory signal encodes the difference between the reward and the expected reward-but only if the expected reward is calculated for each task and stimulus separately. If several tasks are to be learned simultaneously, the nervous system needs an internal critic that is able to predict the expected reward for arbitrary stimuli. We show that, with a critic, reward-modulated spike-timing-dependent plasticity is capable of learning motor trajectories with a temporal resolution of tens of milliseconds. The relation to temporal difference learning, the relevance of block-based learning paradigms, and the limitations of learning with a critic are discussed.
Resumo:
When a racing driver steers a car around a sharp bend, there is a trade-off between speed and accuracy, in that high speed can lead to a skid whereas a low speed increases lap time, both of which can adversely affect the driver's payoff function. While speed-accuracy trade-offs have been studied extensively, their susceptibility to risk sensitivity is much less understood, since most theories of motor control are risk neutral with respect to payoff, i.e., they only consider mean payoffs and ignore payoff variability. Here we investigate how individual risk attitudes impact a motor task that involves such a speed-accuracy trade-off. We designed an experiment where a target had to be hit and the reward (given in points) increased as a function of both subjects' endpoint accuracy and endpoint velocity. As faster movements lead to poorer endpoint accuracy, the variance of the reward increased for higher velocities. We tested subjects on two reward conditions that had the same mean reward but differed in the variance of the reward. A risk-neutral account predicts that subjects should only maximize the mean reward and hence perform identically in the two conditions. In contrast, we found that some (risk-averse) subjects chose to move with lower velocities and other (risk-seeking) subjects with higher velocities in the condition with higher reward variance (risk). This behavior is suboptimal with regard to maximizing the mean number of points but is in accordance with a risk-sensitive account of movement selection. Our study suggests that individual risk sensitivity is an important factor in motor tasks with speed-accuracy trade-offs.
Resumo:
We demonstrate modulations of electrical conductance and hysteresis behavior in ZnO nanowire transistors via electrically polarized switching of ferroelectric liquid crystal (FLC). After coating a nanowire channel in the transistors with FLCs, we observed large increases in channel conductance and hysteresis width, and a strong dependence of hysteresis loops on the polarization states associated with the orientation of electric dipole moments along the direction of the gate electric field. Furthermore, the reversible switching and retention characteristics provide the feasibility of creating a hybrid system with switch and memory functions. © 2013 American Institute of Physics.
Resumo:
Mammalian studies show that frustration is experienced when goal-directed activity is blocked. Despite frustration's strongly negative role in health, aggression and social relationships, the neural mechanisms are not well understood. To address this we developed a task in which participants were blocked from obtaining a reward, an established method of producing frustration. Levels of experienced frustration were parametrically varied by manipulating the participants' motivation to obtain the reward prior to blocking. This was achieved by varying the participants' proximity to a reward and the amount of effort expended in attempting to acquire it. In experiment 1, we confirmed that proximity and expended effort independently enhanced participants' self-reported desire to obtain the reward, and their self-reported frustration and response vigor (key-press force) following blocking. In experiment 2, we used functional magnetic resonance imaging (fMRI) to show that both proximity and expended effort modulated brain responses to blocked reward in regions implicated in animal models of reactive aggression, including the amygdala, midbrain periaqueductal grey (PAG), insula and prefrontal cortex. Our findings suggest that frustration may serve an energizing function, translating unfulfilled motivation into aggressive-like surges via a cortical, amygdala and PAG network.
Resumo:
Creep tests at ambient conditions have been carried out on Kevlar 49 and Technora yarns covering a wide stress spectrum (10-70% average breaking load) for a long period of time (up to a year). The results confirm that Kevlar 49 and Technora yarns show a nonlinear behavior at stresses below 40% of the breaking load and a linear behavior at stresses above 40%. The strength retention following creep for Kevlar 49 and Technora has also been examined. The results show a significant difference in the behavior of the two materials. Kevlar 49 appears to lose strength almost linearly with time, while Technora seems to lose strength much more rapidly. These results would have significant implications for design. © 2012 Wiley Periodicals, Inc. J Appl Polym Sci, 2012 Copyright © 2012 Wiley Periodicals, Inc.
Resumo:
Food preferences are acquired through experience and can exert strong influence on choice behavior. In order to choose which food to consume, it is necessary to maintain a predictive representation of the subjective value of the associated food stimulus. Here, we explore the neural mechanisms by which such predictive representations are learned through classical conditioning. Human subjects were scanned using fMRI while learning associations between arbitrary visual stimuli and subsequent delivery of one of five different food flavors. Using a temporal difference algorithm to model learning, we found predictive responses in the ventral midbrain and a part of ventral striatum (ventral putamen) that were related directly to subjects' actual behavioral preferences. These brain structures demonstrated divergent response profiles, with the ventral midbrain showing a linear response profile with preference, and the ventral striatum a bivalent response. These results provide insight into the neural mechanisms underlying human preference behavior.