11 results for Learning behavior
in Cambridge University Engineering Department Publications Database
Abstract:
The present study investigated the relationship between statistics anxiety, individual characteristics (e.g., trait anxiety and learning strategies), and academic performance. Students enrolled in a statistics course in psychology (N=147) filled in a questionnaire on statistics anxiety, trait anxiety, interest in statistics, mathematical self-concept, learning strategies, and procrastination. Additionally, their performance in the examination was recorded. The structural equation model showed that statistics anxiety held a crucial role as the strongest direct predictor of performance. Students with higher statistics anxiety achieved less in the examination and showed higher procrastination scores. Statistics anxiety was related indirectly to spending less effort and time on learning. Trait anxiety was related positively to statistics anxiety and, counterintuitively, to academic performance. This result can be explained by the heterogeneity of the measure of trait anxiety. The part of trait anxiety that is unrelated to the specific part of statistics anxiety correlated positively with performance.
Abstract:
The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. First, we systematically review studies that investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. Second, we examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, while discounting of delayed health may contribute to the initiation of unhealthy behavior, with repetition, many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.
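As an illustration of why hyperbolic discounting is invoked to explain behavior that departs from earlier intentions, the following minimal sketch compares it with exponential discounting. It assumes the commonly used forms V = A / (1 + kD) and V = A * exp(-rD); the reward amounts, delays, and rate values are hypothetical and are not taken from the abstract.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, r):
    """Exponential discounting: V = A * exp(-r * D)."""
    return amount * math.exp(-r * delay)


def prefers_larger_later(value_fn, rate, front_delay,
                         small=50.0, large=100.0, gap=10.0):
    """Return True if the larger, later reward has the higher discounted value.

    The smaller reward arrives after `front_delay`; the larger one arrives
    `gap` time units after that. All amounts and delays are illustrative.
    """
    sooner = value_fn(small, front_delay, rate)
    later = value_fn(large, front_delay + gap, rate)
    return later > sooner


# Evaluate the same choice when it is imminent (delay 0) and when it is far off (delay 30).
hyperbolic_choices = [prefers_larger_later(hyperbolic_value, 0.5, d) for d in (0.0, 30.0)]
exponential_choices = [prefers_larger_later(exponential_value, 0.1, d) for d in (0.0, 30.0)]

print("hyperbolic:", hyperbolic_choices)
print("exponential:", exponential_choices)
```

With these assumed parameters, the hyperbolic chooser prefers the larger reward when both options are distant but switches to the smaller one when it becomes imminent, whereas the exponential chooser's preference does not depend on the front delay. The sketch does not capture the cue-driven and state-based effects that the abstract argues fall outside hyperbolic discounting.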
Abstract:
Computer simulation experiments were performed to examine the effectiveness of OR- and comparative-reinforcement learning algorithms. In the simulation, human rewards were given as +1 and -1. Two models of human instruction, each determining which reward is given at every step of instruction, were used. Results suggest that human instruction may combine characteristics of both model A and model B, and that the comparative-reinforcement learning algorithm can therefore be expected to be more effective for learning from human instruction.
Abstract:
Picking up an empty milk carton that we believe to be full is a familiar example of adaptive control, because the adaptation process of estimating the carton's weight must proceed simultaneously with the control process of moving the carton to a desired location. Here we show that the motor system initially generates highly variable behavior in such unpredictable tasks but eventually converges to stereotyped patterns of adaptive responses predicted by a simple optimality principle. These results suggest that adaptation can become specifically tuned to identify task-specific parameters in an optimal manner.
Abstract:
Deep belief networks are a powerful way to model complex probability distributions. However, learning the structure of a belief network, particularly one with hidden units, is difficult. The Indian buffet process has been used as a nonparametric Bayesian prior on the directed structure of a belief network with a single infinitely wide hidden layer. In this paper, we introduce the cascading Indian buffet process (CIBP), which provides a nonparametric prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network so each unit can additionally vary its behavior between discrete and continuous representations. We provide Markov chain Monte Carlo algorithms for inference in these belief networks and explore the structures learned on several image data sets.
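For orientation, the sketch below samples a binary matrix from the standard single-layer Indian buffet process, the building block that the abstract says the CIBP extends. It is not the CIBP itself and is not the authors' code; the function names, the concentration parameter `alpha`, and the reading of rows as units in one layer and columns as hidden units in the layer above are assumptions made for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_customers, alpha, seed=0):
    """Sample a binary matrix from the Indian buffet process.

    Customer i takes each existing dish k with probability m_k / i, where
    m_k is the number of earlier customers who took it, and then tries
    Poisson(alpha / i) new dishes.
    """
    rng = random.Random(seed)
    dish_counts = []  # m_k for every dish sampled so far
    rows = []
    for i in range(1, num_customers + 1):
        chosen = {k for k, m_k in enumerate(dish_counts) if rng.random() < m_k / i}
        num_new = sample_poisson(alpha / i, rng)
        first_new = len(dish_counts)
        chosen.update(range(first_new, first_new + num_new))
        dish_counts.extend([0] * num_new)
        for k in chosen:
            dish_counts[k] += 1
        rows.append(chosen)
    width = len(dish_counts)
    return [[1 if k in row else 0 for k in range(width)] for row in rows]


matrix = sample_ibp(num_customers=5, alpha=2.0)
for row in matrix:
    print(row)
```

The number of columns is not fixed in advance, which is what makes the prior nonparametric in width; the cascading construction described in the abstract additionally leaves the depth unbounded.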
Abstract:
This article presents a novel algorithm for learning parameters in statistical dialogue systems which are modeled as Partially Observable Markov Decision Processes (POMDPs). The three main components of a POMDP dialogue manager are a dialogue model representing dialogue state information; a policy that selects the system's responses based on the inferred state; and a reward function that specifies the desired behavior of the system. Ideally both the model parameters and the policy would be designed to maximize the cumulative reward. However, while there are many techniques available for learning the optimal policy, no good ways of learning the optimal model parameters that scale to real-world dialogue systems have been found yet. The presented algorithm, called the Natural Actor and Belief Critic (NABC), is a policy gradient method that offers a solution to this problem. Based on observed rewards, the algorithm estimates the natural gradient of the expected cumulative reward. The resulting gradient is then used to adapt both the prior distribution of the dialogue model parameters and the policy parameters. In addition, the article presents a variant of the NABC algorithm, called the Natural Belief Critic (NBC), which assumes that the policy is fixed and only the model parameters need to be estimated. The algorithms are evaluated on a spoken dialogue system in the tourist information domain. The experiments show that model parameters estimated to maximize the expected cumulative reward result in significantly improved performance compared to the baseline hand-crafted model parameters. The algorithms are also compared to optimization techniques using plain gradients and state-of-the-art random search algorithms. In all cases, the algorithms based on the natural gradient work significantly better. © 2011 ACM.
Abstract:
Motor learning has been extensively studied using dynamic (force-field) perturbations. These induce movement errors that result in adaptive changes to the motor commands. Several state-space models have been developed to explain how trial-by-trial errors drive the progressive adaptation observed in such studies. These models have been applied to adaptation involving novel dynamics, which typically occurs over tens to hundreds of trials, and which appears to be mediated by a dual-rate adaptation process. In contrast, when manipulating objects with familiar dynamics, subjects adapt rapidly within a few trials. Here, we apply state-space models to familiar dynamics, asking whether adaptation is mediated by a single-rate or dual-rate process. Previously, we reported a task in which subjects rotate an object with known dynamics. By presenting the object at different visual orientations, adaptation was shown to be context-specific, with limited generalization to novel orientations. Here we show that a multiple-context state-space model, with a generalization function tuned to visual object orientation, can reproduce the time-course of adaptation and de-adaptation as well as the observed context-dependent behavior. In contrast to the dual-rate process associated with novel dynamics, we show that a single-rate process mediates adaptation to familiar object dynamics. The model predicts that during exposure to the object across multiple orientations, there will be a degree of independence for adaptation and de-adaptation within each context, and that the states associated with all contexts will slowly de-adapt during exposure in one particular context. We confirm these predictions in two new experiments. Results of the current study thus highlight similarities and differences in the processes engaged during exposure to novel versus familiar dynamics. In both cases, adaptation is mediated by multiple context-specific representations. In the case of familiar object dynamics, however, the representations can be engaged based on visual context, and are updated by a single-rate process.
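The distinction between single-rate and dual-rate adaptation can be sketched with a generic trial-by-trial state-space model. The code below is a simplified single-context illustration, not the multiple-context model with an orientation-tuned generalization function described in the abstract; the retention and learning-rate values and the perturbation schedule are hypothetical.

```python
def simulate_single_rate(perturbation, retention=0.95, learning_rate=0.2):
    """One state: x[n+1] = retention * x[n] + learning_rate * error[n]."""
    state = 0.0
    outputs = []
    for p in perturbation:
        outputs.append(state)      # motor output on this trial
        error = p - state          # error experienced on this trial
        state = retention * state + learning_rate * error
    return outputs


def simulate_dual_rate(perturbation, fast=(0.6, 0.2), slow=(0.99, 0.02)):
    """Two states driven by the same error; the output is their sum.

    `fast` and `slow` are (retention, learning_rate) pairs: the fast state
    learns quickly and forgets quickly, the slow state does the opposite.
    """
    x_fast = 0.0
    x_slow = 0.0
    outputs = []
    for p in perturbation:
        total = x_fast + x_slow
        outputs.append(total)
        error = p - total
        x_fast = fast[0] * x_fast + fast[1] * error
        x_slow = slow[0] * x_slow + slow[1] * error
    return outputs


# 100 exposure trials with a unit perturbation, then 50 washout trials.
perturbation = [1.0] * 100 + [0.0] * 50
single = simulate_single_rate(perturbation)
dual = simulate_dual_rate(perturbation)

print("end of exposure:", round(single[99], 3), round(dual[99], 3))
print("end of washout: ", round(single[149], 3), round(dual[149], 3))
```

Fitting both variants to trial-by-trial data and comparing their fits is one way to ask which process better describes adaptation; the fitting and comparison procedure actually used in the study is not given in the abstract.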
Abstract:
Legged locomotion of biological systems can be viewed as a self-organizing process of highly complex system-environment interactions. Walking behavior is, for example, generated from the interactions between many mechanical components (e.g., physical interactions between feet and ground, skeletons and muscle-tendon systems), and distributed informational processes (e.g., sensory information processing, sensory-motor control in the central nervous system, and reflexes) [21]. An interesting aspect of legged locomotion study lies in the fact that there are multiple levels of self-organization processes (at the levels of mechanical dynamics, sensory-motor control, and learning). Previously, the self-organization of mechanical dynamics was nicely demonstrated by the so-called Passive Dynamic Walkers (PDWs; [18]). The PDW is a purely mechanical structure consisting of body, thigh, and shank limbs that are connected by passive joints. When placed on a shallow slope, it exhibits natural bipedal walking dynamics by converting potential to kinetic energy without any actuation. An important contribution of these case studies is that, if designed properly, mechanical dynamics can, on the one hand, generate relatively complex locomotion dynamics and, on the other, induce self-stability against small disturbances without any explicit control of motors. The basic principle of mechanical self-stability appears to be fairly general: several different physics models exhibit similar characteristics in different kinds of behaviors (e.g., hopping, running, and swimming; [2, 4, 9, 16, 19]), and a number of robotic platforms have been developed based on them [1, 8, 13, 22]. © 2009 Springer London.