34 resultados para learning processes


Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Although learning a motor skill, such as a tennis stroke, feels like a unitary experience, researchers who study motor control and learning break the processes involved into a number of interacting components. These components can be organized into four main groups. First, skilled performance requires the effective and efficient gathering of sensory information, such as deciding where and when to direct one's gaze around the court, and thus an important component of skill acquisition involves learning how best to extract task-relevant information. Second, the performer must learn key features of the task such as the geometry and mechanics of the tennis racket and ball, the properties of the court surface, and how the wind affects the ball's flight. Third, the player needs to set up different classes of control that include predictive and reactive control mechanisms that generate appropriate motor commands to achieve the task goals, as well as compliance control that specifies, for example, the stiffness with which the arm holds the racket. Finally, the successful performer can learn higher-level skills such as anticipating and countering the opponent's strategy and making effective decisions about shot selection. In this Primer we shall consider these components of motor learning using as an example how we learn to play tennis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article presents a novel algorithm for learning parameters in statistical dialogue systems which are modeled as Partially Observable Markov Decision Processes (POMDPs). The three main components of a POMDP dialogue manager are a dialogue model representing dialogue state information; a policy that selects the system's responses based on the inferred state; and a reward function that specifies the desired behavior of the system. Ideally both the model parameters and the policy would be designed to maximize the cumulative reward. However, while there are many techniques available for learning the optimal policy, no good ways of learning the optimal model parameters that scale to real-world dialogue systems have been found yet. The presented algorithm, called the Natural Actor and Belief Critic (NABC), is a policy gradient method that offers a solution to this problem. Based on observed rewards, the algorithm estimates the natural gradient of the expected cumulative reward. The resulting gradient is then used to adapt both the prior distribution of the dialogue model parameters and the policy parameters. In addition, the article presents a variant of the NABC algorithm, called the Natural Belief Critic (NBC), which assumes that the policy is fixed and only the model parameters need to be estimated. The algorithms are evaluated on a spoken dialogue system in the tourist information domain. The experiments show that model parameters estimated to maximize the expected cumulative reward result in significantly improved performance compared to the baseline hand-crafted model parameters. The algorithms are also compared to optimization techniques using plain gradients and state-of-the-art random search algorithms. In all cases, the algorithms based on the natural gradient work significantly better. © 2011 ACM.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motor learning has been extensively studied using dynamic (force-field) perturbations. These induce movement errors that result in adaptive changes to the motor commands. Several state-space models have been developed to explain how trial-by-trial errors drive the progressive adaptation observed in such studies. These models have been applied to adaptation involving novel dynamics, which typically occurs over tens to hundreds of trials, and which appears to be mediated by a dual-rate adaptation process. In contrast, when manipulating objects with familiar dynamics, subjects adapt rapidly within a few trials. Here, we apply state-space models to familiar dynamics, asking whether adaptation is mediated by a single-rate or dual-rate process. Previously, we reported a task in which subjects rotate an object with known dynamics. By presenting the object at different visual orientations, adaptation was shown to be context-specific, with limited generalization to novel orientations. Here we show that a multiple-context state-space model, with a generalization function tuned to visual object orientation, can reproduce the time-course of adaptation and de-adaptation as well as the observed context-dependent behavior. In contrast to the dual-rate process associated with novel dynamics, we show that a single-rate process mediates adaptation to familiar object dynamics. The model predicts that during exposure to the object across multiple orientations, there will be a degree of independence for adaptation and de-adaptation within each context, and that the states associated with all contexts will slowly de-adapt during exposure in one particular context. We confirm these predictions in two new experiments. Results of the current study thus highlight similarities and differences in the processes engaged during exposure to novel versus familiar dynamics. In both cases, adaptation is mediated by multiple context-specific representations. In the case of familiar object dynamics, however, the representations can be engaged based on visual context, and are updated by a single-rate process.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The exploits of Martina Navratilova and Roger Federer represent the pinnacle of motor learning. However, when considering the range and complexity of the processes that are involved in motor learning, even the mere mortals among us exhibit abilities that are impressive. We exercise these abilities when taking up new activities - whether it is snowboarding or ballroom dancing - but also engage in substantial motor learning on a daily basis as we adapt to changes in our environment, manipulate new objects and refine existing skills. Here we review recent research in human motor learning with an emphasis on the computational mechanisms that are involved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The exploits of Martina Navratilova and Roger Federer represent the pinnacle of motor learning. However, when considering the range and complexity of the processes that are involved in motor learning, even the mere mortals among us exhibit abilities that are impressive. We exercise these abilities when taking up new activities-whether it is snowboarding or ballroom dancing-but also engage in substantial motor learning on a daily basis as we adapt to changes in our environment, manipulate new objects and refine existing skills. Here we review recent research in human motor learning with an emphasis on the computational mechanisms that are involved. © 2011 Macmillan Publishers Limited. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is to maintain tractability, so the use of approximation is inevitable. We propose applying Gaussian Processes in Reinforcement learning of optimal POMDP dialogue policies, in order (1) to make the learning process faster and (2) to obtain an estimate of the uncertainty of the approximation. We first demonstrate the idea on a simple voice mail dialogue task and then apply this method to a real-world tourist information dialogue task. © 2010 Association for Computational Linguistics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of system identification is more robust than finding point estimates of a parametric function representation. Our principled filtering/smoothing approach for GP dynamic systems is based on analytic moment matching in the context of the forward-backward algorithm. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail. © 2011 IEEE.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Natural sounds are structured on many time-scales. A typical segment of speech, for example, contains features that span four orders of magnitude: Sentences ($\sim1$s); phonemes ($\sim10$−$1$ s); glottal pulses ($\sim 10$−$2$s); and formants ($\sim 10$−$3$s). The auditory system uses information from each of these time-scales to solve complicated tasks such as auditory scene analysis [1]. One route toward understanding how auditory processing accomplishes this analysis is to build neuroscience-inspired algorithms which solve similar tasks and to compare the properties of these algorithms with properties of auditory processing. There is however a discord: Current machine-audition algorithms largely concentrate on the shorter time-scale structures in sounds, and the longer structures are ignored. The reason for this is two-fold. Firstly, it is a difficult technical problem to construct an algorithm that utilises both sorts of information. Secondly, it is computationally demanding to simultaneously process data both at high resolution (to extract short temporal information) and for long duration (to extract long temporal information). The contribution of this work is to develop a new statistical model for natural sounds that captures structure across a wide range of time-scales, and to provide efficient learning and inference algorithms. We demonstrate the success of this approach on a missing data task.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Gaussian processes are gaining increasing popularity among the control community, in particular for the modelling of discrete time state space systems. However, it has not been clear how to incorporate model information, in the form of known state relationships, when using a Gaussian process as a predictive model. An obvious example of known prior information is position and velocity related states. Incorporation of such information would be beneficial both computationally and for faster dynamics learning. This paper introduces a method of achieving this, yielding faster dynamics learning and a reduction in computational effort from O(Dn2) to O((D - F)n2) in the prediction stage for a system with D states, F known state relationships and n observations. The effectiveness of the method is demonstrated through its inclusion in the PILCO learning algorithm with application to the swing-up and balance of a torque-limited pendulum and the balancing of a robotic unicycle in simulation. © 2012 IEEE.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high dimensional data well and can encounter difficulties in inference. We present a novel nonparameteric Bayesian kernel based method to cluster data points without the need to prespecify the number of clusters or to model complicated densities from which data points are assumed to be generated from. The key insight is to use determinants of submatrices of a kernel matrix as a measure of how close together a set of points are. We explore some theoretical properties of the model and derive a natural Gibbs based algorithm with MCMC hyperparameter learning. The model is implemented on a variety of synthetic and real world data sets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We investigate the Student-t process as an alternative to the Gaussian process as a non-parametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the co-variance kernel of a Gaussian process model. We show surprising equivalences between different hierarchical Gaussian process models leading to Student-t processes, and derive a new sampling scheme for the inverse Wishart process, which helps elucidate these equivalences. Overall, we show that a Student-t process can retain the attractive properties of a Gaussian process - a nonparamet-ric representation, analytic marginal and predictive distributions, and easy model selection through covariance kernels - but has enhanced flexibility, and predictive covariances that, unlike a Gaussian process, explicitly depend on the values of training observations. We verify empirically that a Student-t process is especially useful in situations where there are changes in covariance structure, or in applications such as Bayesian optimization, where accurate predictive covariances are critical for good performance. These advantages come at no additional computational cost over Gaussian processes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. We first systematically review studies which investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. We secondly examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, while discounting of delayed health may contribute to the initiation of unhealthy behavior, with repetition, many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.