2 resultados para Learning Choices
em Cambridge University Engineering Department Publications Database
Resumo:
The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors, and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
Resumo:
The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. We first systematically review studies which investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. We secondly examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, while discounting of delayed health may contribute to the initiation of unhealthy behavior, with repetition, many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.