8 results for Reinforcement Learning
in CentAUR: Central Archive University of Reading - UK
Abstract:
Researchers at the University of Reading have developed over many years some simple mobile robots that explore an environment they perceive through simple ultrasonic sensors. Information from these sensors has allowed the robots to learn the simple task of moving around while avoiding dynamic obstacles using a static set of fuzzy automata, the choice of which has been criticised for its arbitrary nature. This paper considers how a dynamic set of automata can overcome this criticism. In addition, a new reinforcement learning function is outlined which is scalable to both different numbers and different types of sensors. The innovations compare successfully with earlier work.
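The abstract does not give the actual reinforcement function, but the idea of a reward signal that scales with the sensor set can be sketched as follows (a purely illustrative Python sketch; the function name, penalty value, and normalization are assumptions, not the paper's design):

    def reinforcement(sensor_readings, collision, max_range=3.0):
        # Hypothetical sensor-count-agnostic reinforcement signal: a collision
        # gives a fixed penalty, otherwise the reward is the mean normalized
        # clearance over however many ultrasonic readings are available.
        if collision:
            return -1.0
        clearances = [min(d, max_range) / max_range for d in sensor_readings]
        return sum(clearances) / len(clearances)

    # Works unchanged for 3, 8, or any other number of sensors.
    print(reinforcement([0.4, 2.9, 1.2], collision=False))   # ~0.5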
Abstract:
We examined the maturation of decision-making from early adolescence to mid-adulthood using fMRI of a variant of the Iowa gambling task. We have previously shown that performance in this task relies on sensitivity to accumulating negative outcomes in ventromedial PFC and dorsolateral PFC. Here, we further formalize outcome evaluation (as driven by prediction errors [PE], using a reinforcement learning model) and examine its development. Task performance improved significantly during adolescence, stabilizing in adulthood. Performance relied on greater impact of negative compared with positive PEs, the relative impact of which matured from adolescence into adulthood. Adolescents also showed increased exploratory behavior, expressed as a propensity to shift responding between options independently of outcome quality, whereas adults showed no systematic shifting patterns. The correlation between PE representation and improved performance strengthened with age for activation in ventral and dorsal PFC, ventral striatum, and temporal and parietal cortices. There was a medial-lateral distinction in the prefrontal substrates of effective PE utilization between adults and adolescents: Increased utilization of negative PEs, a hallmark of successful performance in the task, was associated with increased activation in ventromedial PFC in adults, but decreased activation in ventrolateral PFC and striatum in adolescents. These results suggest that adults and adolescents engage qualitatively distinct neural and psychological processes during decision-making, the development of which is not exclusively dependent on reward-processing maturation.
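The asymmetric weighting of negative versus positive prediction errors described above can be illustrated with a simple Rescorla-Wagner-style update using separate learning rates (a minimal Python sketch; the function and parameter values are assumptions, not the authors' fitted model):

    import numpy as np

    def asymmetric_update(value, reward, alpha_pos=0.1, alpha_neg=0.3):
        # One value update with separate learning rates for positive and
        # negative prediction errors (illustrative parameter values only).
        pe = reward - value                       # prediction error
        alpha = alpha_pos if pe >= 0 else alpha_neg
        return value + alpha * pe

    values = np.zeros(4)                          # one value per deck/option
    for option, reward in enumerate([1.0, -1.0, -1.0, 0.5]):
        values[option] = asymmetric_update(values[option], reward)
    print(values)                                 # losses move values more than gains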
Abstract:
Contrary to the widespread belief that people are positively motivated by reward incentives, some studies have shown that performance-based extrinsic reward can actually undermine a person's intrinsic motivation to engage in a task. This “undermining effect” has timely practical implications, given the burgeoning of performance-based incentive systems in contemporary society. It also presents a theoretical challenge for economic and reinforcement learning theories, which tend to assume that monetary incentives monotonically increase motivation. Despite the practical and theoretical importance of this provocative phenomenon, however, little is known about its neural basis. Herein we induced the behavioral undermining effect using a newly developed task, and we tracked its neural correlates using functional MRI. Our results show that performance-based monetary reward indeed undermines intrinsic motivation, as assessed by the number of voluntary engagements in the task. We found that activity in the anterior striatum and the prefrontal areas decreased along with this behavioral undermining effect. These findings suggest that the corticobasal ganglia valuation system underlies the undermining effect through the integration of extrinsic reward value and intrinsic task value.
Abstract:
Perirhinal cortex in monkeys has been thought to be involved in visual associative learning. The authors examined rats' ability to make associations between visual stimuli in a visual secondary reinforcement task. Rats learned 2-choice visual discriminations for secondary visual reinforcement. They showed significant learning of discriminations before any primary reinforcement. Following bilateral perirhinal cortex lesions, rats continued to learn visual discriminations for visual secondary reinforcement at the same rate as before surgery. Thus, this study does not support a critical role of perirhinal cortex in learning for visual secondary reinforcement. Contrasting this result with other positive results, the authors suggest that the role of perirhinal cortex is in "within-object" associations and that it plays a much lesser role in stimulus-stimulus associations between objects.
Abstract:
The authors consider the problem of a robot manipulator operating in a noisy workspace. The manipulator is required to move from an initial position P(i) to a final position P(f). P(i) is assumed to be completely defined. However, P(f) is obtained by a sensing operation and is assumed to be fixed but unknown. The authors' approach to this problem involves the use of three learning algorithms: the discretized linear reward-penalty (DLR-P) automaton, the linear reward-penalty (LR-P) automaton, and a nonlinear reinforcement scheme. An automaton is placed at each joint of the robot and, acting as a decision maker, plans the trajectory based on noisy measurements of P(f).
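The linear reward-penalty (LR-P) automaton named above follows a standard update rule on a vector of action probabilities, sketched below in Python (the learning-rate values and the toy usage loop are illustrative assumptions, not the paper's implementation):

    import numpy as np

    def lrp_update(p, action, success, a=0.1, b=0.1):
        # One step of a linear reward-penalty (L_R-P) automaton update on the
        # action-probability vector p (learning rates a, b are illustrative).
        p = p.copy()
        r = len(p)
        if success:
            # Reward: shift probability mass toward the chosen action.
            p = (1 - a) * p
            p[action] += a
        else:
            # Penalty: redistribute probability mass away from the chosen action.
            p = b / (r - 1) + (1 - b) * p
            p[action] -= b / (r - 1)
        return p

    rng = np.random.default_rng(0)
    p = np.full(3, 1.0 / 3.0)                     # three candidate joint moves
    for _ in range(200):
        action = rng.choice(3, p=p)
        p = lrp_update(p, action, success=(action == 0))  # hypothetical: move 0 reaches P(f)
    print(p)                                      # probability mass concentrates on action 0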
Abstract:
The ability to change an established stimulus–behavior association based on feedback is critical for adaptive social behaviors. This ability has been examined in reversal learning tasks, where participants first learn a stimulus–response association (e.g., select a particular object to get a reward) and then need to alter their response when reinforcement contingencies change. Although substantial evidence demonstrates that the OFC is a critical region for reversal learning, previous studies have not distinguished reversal learning for emotional associations from neutral associations. The current study examined whether OFC plays similar roles in emotional versus neutral reversal learning. The OFC showed greater activity during reversals of stimulus–outcome associations for negative outcomes than for neutral outcomes. Similar OFC activity was also observed during reversals involving positive outcomes. Furthermore, OFC activity showed a stronger inverse correlation with amygdala activity during negative reversals than during neutral reversals. Overall, our results indicate that the OFC is more activated by emotional than neutral reversal learning and that OFC's interactions with the amygdala are greater for negative than neutral reversal learning.
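A reversal learning paradigm of the kind described above can be simulated with a minimal two-option task in which the rewarded option switches halfway through (an illustrative Python sketch with assumed parameters, not the study's actual design or analysis):

    import numpy as np

    def run_reversal_task(n_trials=200, reversal_at=100, alpha=0.2, beta=5.0):
        # Two-option task: option 0 is rewarded before the reversal point,
        # option 1 afterwards; choices come from a softmax over learned values.
        rng = np.random.default_rng(1)
        q = np.zeros(2)
        choices = []
        for t in range(n_trials):
            rewarded_option = 0 if t < reversal_at else 1   # contingency reversal
            probs = np.exp(beta * q) / np.exp(beta * q).sum()
            choice = rng.choice(2, p=probs)
            outcome = 1.0 if choice == rewarded_option else -1.0
            q[choice] += alpha * (outcome - q[choice])      # prediction-error update
            choices.append(choice)
        return np.array(choices)

    choices = run_reversal_task()
    print("pre-reversal accuracy:", np.mean(choices[:100] == 0))
    print("post-reversal accuracy:", np.mean(choices[100:] == 1))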