24 results for MODEL-FREE

in Cambridge University Engineering Department Publications Database


Relevance:

70.00%

Publisher:

Abstract:

The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also bears hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors, and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences, we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
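
The abstract does not state how the two influences were combined. One common way to express such a mixture is a weighted sum of model-based and model-free action values passed through a softmax choice rule. The sketch below assumes that form; the names `hybrid_values`, `softmax`, the weight `w` and the inverse temperature `beta` are hypothetical and are not taken from the paper.

```python
import math

def hybrid_values(q_mb, q_mf, w):
    """Mix model-based and model-free action values.

    w = 1 gives purely model-based values, w = 0 purely model-free ones.
    (Assumed form for illustration, not the paper's stated model.)
    """
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]

def softmax(values, beta):
    """Turn action values into choice probabilities (beta = inverse temperature)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Two actions on which the two systems disagree.
q_model_based = [1.0, 0.0]
q_model_free = [0.0, 1.0]
choice_probs = softmax(hybrid_values(q_model_based, q_model_free, w=0.5), beta=3.0)
```

Under this assumed form, fitting `w` to choices and, separately, to a neural signal would be one way to compare the "proportions" the abstract refers to.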

Relevance:

60.00%

Publisher:

Abstract:

Model-based and model-free controllers can, in principle, learn arbitrary actions to optimize their behavior, at least those actions that can be expressed and explored. Indeed, these are often referred to as instrumental controllers because their choices are learned to be instrumental for the delivery of desired outcomes. Although this flexibility is very powerful, it comes with an attendant cost of learning. Evolution appears to have endowed everything from the simplest organisms to us with powerful, pre-specified, but inflexible alternatives. These responses are termed Pavlovian, after the famous Russian physiologist and psychologist Pavlov. The responses of the Pavlovian controller are determined by evolutionary (phylogenetic) considerations rather than (ontogenetic) aspects of the contingent development or learning of an individual. These responses directly interact with instrumental choices arising from goal-directed and habitual controllers. This interaction has been studied in a wealth of animal paradigms, and can be helpful, neutral, or harmful, according to circumstance. Although there has been less careful or analytical study of it in humans, it can be interpreted as underpinning a wealth of behavioral aberrations. © 2009 Elsevier Inc. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. We first systematically review studies which investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. We secondly examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, while discounting of delayed health may contribute to the initiation of unhealthy behavior, with repetition, many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.
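
As a small illustration of what hyperbolic discounting predicts, the sketch below uses the standard hyperbolic form V = A / (1 + kD) and, for contrast, exponential discounting V = A·exp(-rD). The reward amounts, delays, and rate values are made-up numbers chosen only to show a preference reversal; they do not come from the reviewed studies.

```python
import math

def hyperbolic(amount, delay, k):
    """Hyperbolically discounted value: A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def exponential(amount, delay, r):
    """Exponentially discounted value: A * exp(-r * D)."""
    return amount * math.exp(-r * delay)

def prefers_larger_later(discount, wait, rate):
    """Compare 50 units after `wait` days with 100 units after `wait + 10` days."""
    smaller_sooner = discount(50.0, wait, rate)
    larger_later = discount(100.0, wait + 10.0, rate)
    return larger_later > smaller_sooner

# Hyperbolic: the larger reward is preferred while both are far away,
# but the preference flips once the smaller reward is immediately available.
far_hyp = prefers_larger_later(hyperbolic, 30.0, 0.2)
near_hyp = prefers_larger_later(hyperbolic, 0.0, 0.2)

# Exponential: the ratio of the two values does not depend on `wait`,
# so the preference is the same at both time points.
far_exp = prefers_larger_later(exponential, 30.0, 0.05)
near_exp = prefers_larger_later(exponential, 0.0, 0.05)
```

This reversal is the kind of intention-incongruent choice that hyperbolic discounting can describe; the abstract's point is that cue-triggered and state-dependent lapses are not captured by this parameterization.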

Relevance:

60.00%

Publisher:

Abstract:

There is much to gain from providing walking machines with passive dynamics, e.g. by including compliant elements in the structure. These elements can offer interesting properties such as self-stabilization, energy efficiency and simplified control. However, there is still no general design strategy for such robots and their controllers. In particular, the calibration of control parameters is often complicated because of the highly nonlinear behavior of the interactions between passive components and the environment. In this article, we propose an approach in which the calibration of a key parameter of a walking controller, namely its intrinsic frequency, is done automatically. The approach uses adaptive frequency oscillators to automatically tune the intrinsic frequency of the oscillators to the resonant frequency of a compliant quadruped robot. The tuning goes beyond simple synchronization, and the learned frequency stays in the controller when the robot is brought to a halt. The controller is model-free, robust and simple. Results are presented illustrating how the controller can robustly tune itself to the robot, as well as readapt when the mass of the robot is changed. We also provide an analysis of the convergence of the frequency adaptation for a linearized plant, and show how that analysis is useful for determining which type of sensory feedback must be used for stable convergence. This approach is expected to explain some aspects of developmental processes in biological and artificial adaptive systems that "develop" through the embodied system-environment interactions. © 2006 IEEE.
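
To make the idea of frequency adaptation concrete, here is a minimal sketch of one widely used form of adaptive frequency oscillator, a Hopf oscillator whose frequency is itself a state variable. It is not taken from the abstract: the robot's sensory feedback is replaced by a plain cosine input as a stand-in, and the function name, parameter values, and forward-Euler integration are all assumptions made for illustration.

```python
import math

def adapt_frequency(omega0, omega_input, eps=1.0, gamma=8.0, mu=1.0,
                    dt=0.001, duration=120.0):
    """Integrate an adaptive Hopf oscillator driven by cos(omega_input * t).

    State: (x, y) on a limit cycle of radius sqrt(mu), plus the frequency omega.
    Returns the value of omega at the end of the simulation.
    """
    x, y, omega = 1.0, 0.0, omega0
    for n in range(int(duration / dt)):
        f = math.cos(omega_input * n * dt)   # stand-in for sensory feedback
        r = math.hypot(x, y)
        dx = gamma * (mu - r * r) * x - omega * y + eps * f
        dy = gamma * (mu - r * r) * y + omega * x
        domega = -eps * f * y / r            # frequency adaptation rule
        x += dt * dx                         # forward-Euler step
        y += dt * dy
        omega += dt * domega
    return omega

# Start at 13 rad/s and let the oscillator adapt to a 10 rad/s input.
learned = adapt_frequency(omega0=13.0, omega_input=10.0)
```

In this form the frequency only changes while the input is non-zero (`domega` is proportional to `f`), which is consistent with the abstract's remark that the learned frequency stays in the controller once the robot stops.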

Relevance:

40.00%

Publisher:

Abstract:

This paper presents a practical destruction-free parameter extraction methodology for a new physics-based circuit simulator buffer-layer Integrated Gate Commutated Thyristor (IGCT) model. Most key parameters needed for this model can be extracted by one simple clamped inductive-load switching experiment. To validate this extraction method, a clamped inductive load switching experiment was performed, and corresponding simulations were carried out by employing the IGCT model with parameters extracted through the presented methodology. Good agreement has been obtained between the experimental data and simulation results.

Relevance:

40.00%

Publisher:

Abstract:

Recent developments in modeling driver steering control with preview are reviewed. While some validation with experimental data has been presented, the rigorous application of formal system identification methods has not yet been attempted. This paper describes a steering controller based on linear model-predictive control. An indirect identification method that minimizes steering angle prediction error is developed. Special attention is given to filtering the prediction error so as to avoid identification bias that arises from the closed-loop operation of the driver-vehicle system. The identification procedure is applied to data collected from 14 test drivers performing double lane change maneuvers in an instrumented vehicle. It is found that the identification procedure successfully finds parameter values for the model that give small prediction errors. The procedure is also able to distinguish between the different steering strategies adopted by the test drivers. © 2006 IEEE.

Relevance:

30.00%

Publisher:

Abstract:

We describe a method for text entry based on inverse arithmetic coding that relies on gaze direction and which is faster and more accurate than using an on-screen keyboard. These benefits are derived from two innovations: the writing task is matched to the capabilities of the eye, and a language model is used to make predictable words and phrases easier to write.

Relevance:

30.00%

Publisher:

Abstract:

Using an entropy argument, it is shown that stochastic context-free grammars (SCFG's) can model sources with hidden branching processes more efficiently than stochastic regular grammars (or equivalently HMM's). However, the automatic estimation of SCFG's using the Inside-Outside algorithm is limited in practice by its O(n³) complexity. In this paper, a novel pre-training algorithm is described which can give significant computational savings. Also, the need for controlling the way that non-terminals are allocated to hidden processes is discussed and a solution is presented in the form of a grammar minimization procedure. © 1990.
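
The cubic cost mentioned above comes from the chart computation at the heart of Inside-Outside. The sketch below shows only the inside pass for a grammar in Chomsky normal form, with three nested loops over span length, start position, and split point. The function name, the dictionary-based rule encoding, and the toy grammar are assumptions for illustration and are not the paper's pre-training algorithm.

```python
from collections import defaultdict

def inside_probability(words, binary_rules, lexical_rules, start="S"):
    """Probability that `start` derives `words` under a CNF stochastic grammar.

    binary_rules:  {(A, B, C): p} for rules A -> B C
    lexical_rules: {(A, word): p} for rules A -> word
    """
    n = len(words)
    # beta[(i, j)][A] = P(A derives words[i:j])
    beta = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in lexical_rules.items():
            if word == w:
                beta[(i, i + 1)][a] += p
    for span in range(2, n + 1):              # span length
        for i in range(n - span + 1):         # span start
            j = i + span
            for k in range(i + 1, j):         # split point
                for (a, b, c), p in binary_rules.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right
    return beta[(0, n)].get(start, 0.0)

# Toy grammar: S -> S S (0.3) | 'a' (0.7)
binary = {("S", "S", "S"): 0.3}
lexical = {("S", "a"): 0.7}
p_two = inside_probability(["a", "a"], binary, lexical)
p_three = inside_probability(["a", "a", "a"], binary, lexical)
```

For the toy grammar, "a a" has a single parse and "a a a" has two, so the three-word probability sums over both bracketings.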

Relevance:

30.00%

Publisher:

Abstract:

This paper describes two applications in speech recognition of the use of stochastic context-free grammars (SCFGs) trained automatically via the Inside-Outside Algorithm. First, SCFGs are used to model VQ encoded speech for isolated word recognition and are compared directly to HMMs used for the same task. It is shown that SCFGs can model this low-level VQ data accurately and that a regular grammar based pre-training algorithm is effective both for reducing training time and obtaining robust solutions. Second, an SCFG is inferred from a transcription of the speech used to train a phoneme-based recognizer in an attempt to model phonotactic constraints. When used as a language model, this SCFG gives improved performance over a comparable regular grammar or bigram. © 1991.

Relevance:

30.00%

Publisher:

Abstract:

A complete optical system model has been developed and used to assess chirped fibre Bragg grating dispersion compensators. Gratings suitable for dispersion compensation in both laser-based and modulator-based optical communications systems have been modelled. A grating 10 cm in length has been shown to permit virtually dispersion-free transmission over 425 km when used in an externally modulated system. Long-haul dispersion compensation using several 2 cm gratings spaced at intervals along the fibre is also modelled, illustrating viable 10 Gbit/s transmission over a distance in excess of 168 km.

Relevance:

30.00%

Publisher:

Abstract:

The application of automated design optimization to real-world, complex geometry problems is a significant challenge, especially if the topology is not known a priori, as in turbine internal cooling. The long-term goal of our work is to focus on an end-to-end integration of the whole CFD process, from solid model through meshing, solving and post-processing, to enable this type of design optimization to become viable and practical. In recent papers we have reported the integration of a Level Set based geometry kernel with an octree-based cut-Cartesian mesh generator, RANS flow solver, post-processing and geometry editing, all within a single piece of software and all implemented in parallel with commodity PC clusters as the target. The cut-cells which characterize the approach are eliminated by exporting a body-conformal mesh guided by the underpinning Level Set. This paper extends this work still further with a simple scoping study showing how the basic functionality can be scripted and automated and then used as the basis for automated optimization of a generic gas turbine cooling geometry. Copyright © 2008 by W.N.Dawes.

Relevance:

30.00%

Publisher:

Abstract:

Model-based approaches to handle additive and convolutional noise have been extensively investigated and used. However, the application of these schemes to handling reverberant noise has received less attention. This paper examines the extension of two standard additive/convolutional noise approaches to handling reverberant noise. The first is an extension of vector Taylor series (VTS) compensation, reverberant VTS, where a mismatch function including reverberant noise is used. The second scheme modifies constrained MLLR to allow a wide-span of frames to be taken into account and projected into the required dimensionality. To allow additive noise to be handled, both these schemes are combined with standard VTS. The approaches are evaluated and compared on two tasks, MC-WSJ-AV, and a reverberant simulated version of AURORA-4. © 2011 IEEE.

Relevance:

30.00%

Publisher:

Abstract:

When a thin rectangular plate is restrained on the two long edges and free on the remaining edges, the equivalent stiffness of the restraining joints can be identified by the order of the natural frequencies obtained using the free response of the plate at a single location. This work presents a method to identify the equivalent stiffness of the restraining joints, being represented as simply supporting the plate but elastically restraining it in rotation. An integral transform is used to map the autospectrum of the free response from the frequency domain to the stiffness domain in order to identify the equivalent torsional stiffness of the restrained edges of the plate and also the order of natural frequencies. The kernel of the integral transform is built interpolating data from a finite element model of the plate. The method introduced in this paper can also be applied to plates or shells with different shapes and boundary conditions. © 2011 Elsevier Ltd. All rights reserved.