208 results for prediction error
Abstract:
Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions. These theories highlight a central role for reward prediction errors in updating the values associated with available actions. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-L-phenylalanine; L-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with L-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.
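The "standard action-value learning algorithm" mentioned in this abstract can be sketched as a Q-learning rule with a softmax choice function. This is a minimal illustration only, not the authors' fitted model: the learning rate `alpha`, the inverse temperature `beta`, the reward probabilities in `p_reward` and all function names are assumptions chosen for the example, and the sketch does not model any drug effect.

```python
import math
import random


def prediction_error(reward, value):
    """Reward prediction error: obtained reward minus expected value."""
    return reward - value


def update_value(value, reward, alpha):
    """Move the action value a fraction alpha towards the obtained reward."""
    delta = prediction_error(reward, value)
    return value + alpha * delta, delta


def choice_probability(q_a, q_b, beta):
    """Softmax probability of choosing action A over action B.

    For two options the softmax reduces to a logistic function of the
    value difference; beta (hypothetical) sets how deterministic choices are.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def simulate(n_trials=100, alpha=0.3, beta=3.0, p_reward=(0.8, 0.2), seed=0):
    """Simulate a two-armed task; all parameter values are illustrative."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # action values start at zero
    n_best = 0      # how often the richer action (index 0) was chosen
    for _ in range(n_trials):
        p_first = choice_probability(q[0], q[1], beta)
        action = 0 if rng.random() < p_first else 1
        reward = 1.0 if rng.random() < p_reward[action] else 0.0
        q[action], _delta = update_value(q[action], reward, alpha)
        n_best += action == 0
    return q, n_best / n_trials


if __name__ == "__main__":
    values, fraction_best = simulate()
    print("final action values:", values)
    print("fraction of choices of the richer action:", fraction_best)
```

Under these assumptions, a larger prediction error produces a larger value update, which is the quantity the abstract reports as being modulated in the striatum.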
Abstract:
This paper examines the sources of uncertainty in models used to predict vibration from underground railways. It will become clear from this presentation that varying parameters by a small amount, consistent with uncertainties in measured data, changes the predicted vibration levels significantly, often by more than 10 dB. This error cannot be forecast. Small changes made to soil parameters (compressive and shear wave velocities and density), to slab bending stiffness and mass, and to the measurement position give rise to changes in vibration levels of more than 10 dB. So if a 10 dB prediction error results from small uncertainties in soil parameters and measurement position, it cannot be sensible to rely on prediction models for accuracy better than 10 dB. The presentation will demonstrate the new, freely available PiP software, which calculates vibration from railway tunnels in real time.
Abstract:
Recent developments in modeling driver steering control with preview are reviewed. While some validation with experimental data has been presented, the rigorous application of formal system identification methods has not yet been attempted. This paper describes a steering controller based on linear model-predictive control. An indirect identification method that minimizes steering angle prediction error is developed. Special attention is given to filtering the prediction error so as to avoid identification bias that arises from the closed-loop operation of the driver-vehicle system. The identification procedure is applied to data collected from 14 test drivers performing double lane change maneuvers in an instrumented vehicle. It is found that the identification procedure successfully finds parameter values for the model that give small prediction errors. The procedure is also able to distinguish between the different steering strategies adopted by the test drivers. © 2006 IEEE.
Abstract:
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic plasticity. On the other hand, the theory of reinforcement learning provides a framework for reward-based learning. Recent models of reward-modulated spike-timing-dependent plasticity have made first steps towards bridging the gap between the two approaches, but faced two problems. First, reinforcement learning is typically formulated in a discrete framework, ill-adapted to the description of natural situations. Second, biologically plausible models of reward-modulated spike-timing-dependent plasticity require precise calculation of the reward prediction error, yet it remains to be shown how this can be computed by neurons. Here we propose a solution to these problems by extending the continuous temporal difference (TD) learning of Doya (2000) to the case of spiking neurons in an actor-critic network operating in continuous time, and with continuous state and action representations. In our model, the critic learns to predict expected future rewards in real time. Its activity, together with actual rewards, conditions the delivery of a neuromodulatory TD signal to itself and to the actor, which is responsible for action choice. In simulations, we show that such an architecture can solve a Morris water-maze-like navigation task, in a number of trials consistent with reported animal performance. We also use our model to solve the acrobot and the cartpole problems, two complex motor control tasks. Our model provides a plausible way of computing reward prediction error in the brain. Moreover, the analytically derived learning rule is consistent with experimental evidence for dopamine-modulated spike-timing-dependent plasticity.
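For orientation, the continuous-time TD error that this abstract builds on can be written as below. This is a sketch of the standard form from Doya (2000), not the spiking-network rule derived in the paper; the symbols $V$ (value estimated by the critic), $r$ (reward rate) and $\tau$ (discount time constant) are notation assumed here.

```latex
% Value function: exponentially discounted future reward
V(t) = \int_{t}^{\infty} e^{-(s-t)/\tau}\, r(s)\, \mathrm{d}s

% Differentiating with respect to t gives the consistency condition
\dot{V}(t) = \frac{1}{\tau} V(t) - r(t)

% The TD error measures the violation of that condition
\delta(t) = r(t) - \frac{1}{\tau} V(t) + \dot{V}(t)
```

When the critic's estimate is exact, $\delta(t) = 0$; a nonzero $\delta(t)$ is the signal that would be broadcast to the critic and the actor.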
Abstract:
Motor control strongly relies on neural processes that predict the sensory consequences of self-generated actions. Previous research has demonstrated deficits in such sensory-predictive processes in schizophrenic patients, and these low-level deficits are thought to contribute to the emergence of delusions of control. Here, we examined the extent to which individual differences in sensory prediction are associated with a tendency towards delusional ideation in healthy participants. We used a force-matching task to quantify sensory-predictive processes, and administered questionnaires to assess schizotypy and delusion-like thinking. Individuals with higher levels of delusional ideation showed more accurate force matching, suggesting that such thinking is associated with a reduced tendency to predict and attenuate the sensory consequences of self-generated actions. These results suggest that deficits in sensory prediction in schizophrenia are not simply consequences of the deluded state and are not related to neuroleptic medication. Rather, they appear to be stable, trait-like characteristics of an individual, a finding that has important implications for our understanding of the neurocognitive basis of delusions.
Abstract:
The non-deterministic relationship between Bit Error Rate and Packet Error Rate is demonstrated for an optical media access layer in common use. We show that frequency components of coded, non-random data can cause this relationship. © 2005 Optical Society of America.
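As a point of comparison, the textbook mapping from Bit Error Rate to Packet Error Rate assumes independent, identically distributed bit errors, in which case PER = 1 - (1 - BER)^N for an N-bit packet. The sketch below computes only that idealised baseline; it is not the analysis from the paper, which reports that coded, non-random data breaks this one-to-one mapping. The function name and the example packet length are assumptions.

```python
import math


def packet_error_rate(ber, n_bits):
    """Idealised PER for an n_bits packet with independent bit errors.

    Uses log1p/expm1 so that very small bit error rates do not lose
    precision in the subtraction 1 - (1 - ber) ** n_bits.
    """
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must lie in [0, 1]")
    if n_bits < 0:
        raise ValueError("n_bits must be non-negative")
    if ber == 1.0:
        # log1p(-1.0) is undefined; every packet with at least one bit fails
        return 1.0 if n_bits > 0 else 0.0
    return -math.expm1(n_bits * math.log1p(-ber))


if __name__ == "__main__":
    # Hypothetical example: a 12000-bit packet at several bit error rates
    for ber in (1e-12, 1e-9, 1e-6):
        print(ber, packet_error_rate(ber, 12000))
```

A measured PER that departs from this curve at a fixed BER indicates that the independence assumption does not hold.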
Abstract:
The work in this paper forms part of a project on the use of large eddy simulation (LES) for broadband rotor-stator interaction noise prediction. Here we focus on LES of the flow field near a fan blade trailing edge. The first part of the paper aims to evaluate LES suitability for predicting the near-field velocity field for a blunt NACA-0012 airfoil at moderate Reynolds numbers (2 × 10^5 and 4 × 10^5). Preliminary computations of turbulent mean and root-mean-square velocities, as well as energy spectra at the trailing edge, are compared with those from a recent experiment [1]. The second part of the paper describes preliminary progress on an LES calculation of the fan wakes on a fan rig [2]. The CFD code uses a mixed element unstructured mesh with a median dual control volume. A wall-adapting local eddy-viscosity sub-grid scale model is employed. A very small amount of numerical dissipation is added in the numerical scheme to keep the compressible solver stable. Further results for the fan turbulent mean and RMS velocity, and especially the aeroacoustics field, will be presented at a later stage. Copyright © 2008 by Qinling LI, Nigel Peake & Mark Savill.