66 results for learning in play-based environments

in Cambridge University Engineering Department Publications Database


Relevance:

100.00% 100.00%

Publisher:

Abstract:

A novel framework is provided for very fast model-based reinforcement learning in continuous state and action spaces. It requires probabilistic models that explicitly characterize their levels of confidence. Within the framework, flexible, non-parametric models are used to describe the world based on previously collected experience. The framework is demonstrated on the cart-pole problem in a setting where very limited prior knowledge about the task has been provided. Learning progressed rapidly, and a good policy was found after only a small number of iterations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Learning is often understood as an organism's gradual acquisition of the association between a given sensory stimulus and the correct motor response. Mathematically, this corresponds to regressing a mapping between the set of observations and the set of actions. Recently, however, it has been shown both in cognitive and motor neuroscience that humans are not only able to learn particular stimulus-response mappings, but are also able to extract abstract structural invariants that facilitate generalization to novel tasks. Here we show how such structure learning can enhance facilitation in a sensorimotor association task performed by human subjects. Using regression and reinforcement learning models we show that the observed facilitation cannot be explained by these basic models of learning stimulus-response associations. We show, however, that the observed data can be explained by a hierarchical Bayesian model that performs structure learning. In line with previous results from cognitive tasks, this suggests that hierarchical Bayesian inference might provide a common framework to explain both the learning of specific stimulus-response associations and the learning of abstract structures that are shared by different task environments.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

'Learning to learn' phenomena have been widely investigated in cognition, perception and more recently also in action. During concept learning tasks, for example, it has been suggested that characteristic features are abstracted from a set of examples with the consequence that learning of similar tasks is facilitated, a process termed 'learning to learn'. From a computational point of view such an extraction of invariants can be regarded as learning of an underlying structure. Here we review the evidence for structure learning as a 'learning to learn' mechanism, especially in sensorimotor control where the motor system has to adapt to variable environments. We review studies demonstrating that common features of variable environments are extracted during sensorimotor learning and exploited for efficient adaptation in novel tasks. We conclude that structure learning plays a fundamental role in skill learning and may underlie the unsurpassed flexibility and adaptability of the motor system.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The technique presented in this paper enables a simple, accurate and unbiased measurement of hand stiffness during human arm movements. Using a computer-controlled mechanical interface, the hand is shifted relative to a prediction of the undisturbed trajectory. Stiffness is then computed as the restoring force divided by the position amplitude of the perturbation. A precise prediction algorithm ensures the measurement quality. We used this technique to measure stiffness in free movements and after adaptation to a linear velocity-dependent force field. The subjects compensated for the external force by co-contracting muscles selectively. The stiffness geometry changed with learning and stiffness tended to increase in the direction of the external force.
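The stiffness computation described in this abstract (restoring force divided by the position amplitude of the perturbation) can be illustrated with a minimal Python sketch. The function name, the units, and the numbers are assumptions chosen for illustration; they are not taken from the paper, which does not give its implementation here.

```python
def hand_stiffness(restoring_force_n, displacement_m):
    """Scalar stiffness estimate in N/m.

    restoring_force_n: restoring force measured during the perturbation (N).
    displacement_m: amplitude of the imposed shift relative to the
        predicted undisturbed trajectory (m).
    Both arguments and their units are illustrative assumptions.
    """
    if displacement_m == 0:
        raise ValueError("perturbation amplitude must be non-zero")
    return restoring_force_n / displacement_m


# Hypothetical values: a 4 N restoring force for an 8 mm shift
# gives a stiffness of roughly 500 N/m.
k = hand_stiffness(4.0, 0.008)
print(round(k, 1))
```

This one-dimensional ratio covers a single perturbation direction only; the "stiffness geometry" mentioned in the abstract would need measurements along several directions.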

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The ability to use environmental stimuli to predict impending harm is critical for survival. Such predictions should be available as early as they are reliable. In pavlovian conditioning, chains of successively earlier predictors are studied in terms of higher-order relationships, and have inspired computational theories such as temporal difference learning. However, there is at present no adequate neurobiological account of how this learning occurs. Here, in a functional magnetic resonance imaging (fMRI) study of higher-order aversive conditioning, we describe a key computational strategy that humans use to learn predictions about pain. We show that neural activity in the ventral striatum and the anterior insula displays a marked correspondence to the signals for sequential learning predicted by temporal difference models. This result reveals a flexible aversive learning process ideally suited to the changing and uncertain nature of real-world environments. Taken with existing data on reward learning, our results suggest a critical role for the ventral striatum in integrating complex appetitive and aversive predictions to coordinate behaviour.
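As a rough illustration of the temporal difference idea mentioned in this abstract, the sketch below runs a textbook TD(0) update on a two-cue chain in which an earlier cue "A" precedes a later cue "B", and "B" precedes the outcome. It is a generic sketch, not the model fitted in the study; the cue names, learning rate, discount factor, and outcome magnitude are all assumed values.

```python
def td0_chain(n_trials=200, alpha=0.1, gamma=1.0, outcome=1.0):
    """Run TD(0) on the chain A -> B -> outcome and return cue values.

    alpha (learning rate), gamma (discount) and outcome (aversive
    magnitude) are illustrative assumptions.
    """
    values = {"A": 0.0, "B": 0.0}
    for _ in range(n_trials):
        # Transition A -> B: nothing happens yet, so the prediction
        # error depends only on the current value of B.
        delta_a = 0.0 + gamma * values["B"] - values["A"]
        values["A"] += alpha * delta_a
        # Transition B -> outcome: the trial then ends, so the
        # value of the terminal state is taken as 0.
        delta_b = outcome + gamma * 0.0 - values["B"]
        values["B"] += alpha * delta_b
    return values


early = td0_chain(n_trials=5)
late = td0_chain(n_trials=200)
print(early, late)
```

In the early trials the later cue "B" acquires predictive value first, and the value then propagates back to the earlier cue "A"; this backward transfer is what makes TD models a natural account of higher-order conditioning.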

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Despite many approaches proposed in the past, robotic climbing in a complex vertical environment is still a big challenge. We present here an alternative climbing technology that is based on thermoplastic adhesive (TPA) bonds. The approach has a great advantage because of its large payload capacity and its applicability to a wide range of flat surfaces and complex vertical terrains. The large payload capacity comes from a physical process of thermal bonding, while the wide applicability benefits from the rheological properties of TPAs at higher temperatures and from the intermolecular forces between TPAs and adherends as they cool down. A particular type of TPA has been used in combination with two robotic platforms, featuring different foot designs, including heating/cooling methods and construction of footpads. Various experiments have been conducted to quantitatively assess different aspects of the approach. Results show that an exceptionally high ratio of 500% between dynamic payloads and body mass can be achieved for stable and repeatable vertical climbing on flat surfaces at a low speed. Assessments on four types of typical complex vertical terrains, characterized by a terrain shape index ranging from -0.114 to 0.167, return success rates of 80%-100% across all terrains. © 2004-2012 IEEE.