11 results for action control
in Cambridge University Engineering Department Publications Database
Abstract:
A novel framework is provided for very fast model-based reinforcement learning in continuous state and action spaces. It requires probabilistic models that explicitly characterize their levels of confidence. Within the framework, flexible, non-parametric models are used to describe the world based on previously collected experience. It demonstrates learning on the cart-pole problem in a setting where very limited prior knowledge about the task has been provided. Learning progressed rapidly, and a good policy was found after only a small number of iterations.
Abstract:
This paper extends the recently developed multiplexed model predictive control (MMPC) concept to ensure satisfaction of hard constraints despite the action of persistent, unknown but bounded disturbances. MMPC uses asynchronous control moves on each input channel instead of synchronised moves on all channels. It offers reduced computation, by dividing the online optimisation into a smaller problem for each channel, and potential performance improvements, as the response to a disturbance is quicker, albeit via only one channel. Robustness to disturbances is introduced using the constraint tightening approach, tailored to suit the asynchronous updates of MMPC and the resulting time-varying optimisations. Numerical results are presented, involving a simple mechanical example and an aircraft control example, showing the potential computational and performance benefits of the new robust MMPC.
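The asynchronous update pattern described in this abstract can be pictured with a minimal sketch. It assumes a fixed round-robin order over the input channels, which the abstract does not state, and it replaces the per-channel optimisation with a hypothetical callback named `propose_move`. Constraint tightening and the plant model are left out entirely, so this shows only the scheduling idea, not the paper's controller.

```python
def multiplexed_updates(n_steps, n_channels, propose_move, u0):
    """Update exactly one input channel per time step, holding the others.

    Assumptions (not from the abstract): channels are visited in a fixed
    round-robin order, and propose_move(k, channel, u) is a hypothetical
    stand-in for the smaller per-channel optimisation. It returns the new
    value for that channel only.
    """
    u = list(u0)
    history = []
    for k in range(n_steps):
        channel = k % n_channels               # channel allowed to move at step k
        u[channel] = propose_move(k, channel, u)
        history.append((k, channel, tuple(u)))  # all other channels are unchanged
    return history


# Toy usage: the "optimiser" simply returns k + 1 so the pattern is visible.
history = multiplexed_updates(4, 2, lambda k, ch, u: float(k + 1), [0.0, 0.0])
for step in history:
    print(step)
```

Each step changes a single entry of the input vector, which is why a disturbance can be answered at the next step through whichever channel is due, rather than waiting for a full synchronised update.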
Abstract:
Numerous psychophysical studies suggest that the sensorimotor system chooses actions that optimize the average cost associated with a movement. Recently, however, violations of this hypothesis have been reported in line with economic theories of decision-making that not only consider the mean payoff, but are also sensitive to risk, that is, the variability of the payoff. Here, we examine the hypothesis that risk-sensitivity in sensorimotor control arises as a mean-variance trade-off in movement costs. We designed a motor task in which participants could choose between a sure motor action that resulted in a fixed amount of effort and a risky motor action that resulted in a variable amount of effort that could be either lower or higher than the fixed effort. By changing the mean effort of the risky action while experimentally fixing its variance, we determined indifference points at which participants chose equiprobably between the sure, fixed-effort option and the risky, variable-effort option. Depending on whether participants accepted a variable effort with a mean that was higher than, lower than or equal to the fixed effort, they could be classified as risk-seeking, risk-averse or risk-neutral. Most subjects were risk-sensitive in our task, consistent with a mean-variance trade-off in effort, thereby underlining the importance of risk-sensitivity in computational models of sensorimotor control.
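The indifference-point logic in this abstract can be made concrete with a small sketch. It assumes the common linear mean-variance form, cost = mean + theta * variance, where `theta` is a hypothetical risk-attitude parameter; the abstract does not give the exact cost function used in the study, and all function names here are illustrative.

```python
def mean_variance_cost(mean_effort, variance, theta):
    """Assumed linear mean-variance cost: theta > 0 penalises variability."""
    return mean_effort + theta * variance


def indifference_mean(fixed_effort, variance, theta):
    """Mean effort of the risky option at which both options cost the same.

    Solves fixed_effort = mean + theta * variance for the mean.
    """
    return fixed_effort - theta * variance


def classify(indiff_mean, fixed_effort, tol=1e-9):
    """Classify risk attitude from where the indifference point lies."""
    if indiff_mean > fixed_effort + tol:
        return "risk-seeking"   # accepts a higher mean effort for the gamble
    if indiff_mean < fixed_effort - tol:
        return "risk-averse"    # needs a lower mean effort to accept the gamble
    return "risk-neutral"


# Toy numbers, chosen only for illustration.
fixed, var = 10.0, 4.0
for theta in (0.5, 0.0, -0.5):
    mu_star = indifference_mean(fixed, var, theta)
    print(theta, mu_star, classify(mu_star, fixed))
```

Under this assumed form, holding the variance fixed and sweeping the mean, as the experiment does, identifies `theta` directly from the measured indifference point.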
Abstract:
'Learning to learn' phenomena have been widely investigated in cognition, perception and more recently also in action. During concept learning tasks, for example, it has been suggested that characteristic features are abstracted from a set of examples with the consequence that learning of similar tasks is facilitated, a process termed 'learning to learn'. From a computational point of view such an extraction of invariants can be regarded as learning of an underlying structure. Here we review the evidence for structure learning as a 'learning to learn' mechanism, especially in sensorimotor control where the motor system has to adapt to variable environments. We review studies demonstrating that common features of variable environments are extracted during sensorimotor learning and exploited for efficient adaptation in novel tasks. We conclude that structure learning plays a fundamental role in skill learning and may underlie the unsurpassed flexibility and adaptability of the motor system.
Abstract:
Common-rail fuel injection systems on modern light duty diesel engines are effectively able to respond instantaneously to changes in the demanded injection quantity. In contrast, the air-system is subject to significantly slower dynamics, primarily due to filling/emptying effects in the manifolds and turbocharger inertia. The behaviour of the air-path in a diesel engine is therefore the main limiting factor in terms of engine-out emissions during transient operation. This paper presents a simple mean-value model for the air-path during throttled operation, which is used to design a feed-forward controller that delivers very rapid changes in the in-cylinder charge properties. The feed-forward control action is validated using a state-of-the-art sampling system that allows true cycle-by-cycle measurement of the in-cylinder CO2 concentration. © 2011 SAE International.
Abstract:
Motivational theories of pain highlight its role in people's choices of actions that avoid bodily damage. By contrast, little is known regarding how pain influences action implementation. To explore this less-understood area, we conducted a study in which participants had to rapidly point to a target area to win money while avoiding an overlapping penalty area that would cause pain in their contralateral hand. We found that pain intensity and target-penalty proximity repelled participants' movement away from pain and that motor execution was influenced not by absolute pain magnitudes but by relative pain differences. Our results indicate that the magnitude and probability of pain have a precise role in guiding motor control and that representations of pain that guide action are, at least in part, relative rather than absolute. Additionally, our study shows that the implicit monetary valuation of pain, like many explicit valuations (e.g., patients' use of rating scales in medical contexts), is unstable, a finding that has implications for pain treatment in clinical contexts.
Abstract:
The paper is concerned with the identification of theoretical preview steering controllers using data obtained from five test subjects in a fixed-base driving simulator. An understanding of human steering control behaviour is relevant to the design of autonomous and semi-autonomous vehicle controls. The driving task involved steering a linear vehicle along a randomly curving path. The theoretical steering controllers identified from the data were based on optimal linear preview control. A direct-identification method was used, and the steering controllers were identified so that the predicted steering angle matched as closely as possible the measured steering angle of the test subjects. It was found that identification of the driver's time delay and noise is necessary to avoid bias in identification of the controller parameters. Most subjects' steering behaviour was predicted well by a theoretical controller based on the lateral/yaw dynamics of the vehicle. There was some evidence that an inexperienced driver's steering action was better represented by a controller based on a simpler model of the vehicle dynamics, perhaps reflecting incomplete learning by the driver. Copyright © 2014 Inderscience Enterprises Ltd.
Abstract:
In this paper, we develop a linear technique that predicts how the stability of a thermo-acoustic system changes due to the action of a generic passive feedback device or a generic change in the base state. From this, one can calculate the passive device or base state change that most stabilizes the system. This theoretical framework, based on adjoint equations, is applied to two types of Rijke tube. The first contains an electrically-heated hot wire and the second contains a diffusion flame. Both heat sources are assumed to be compact so that the acoustic and heat release models can be decoupled. We find that the most effective passive control device is an adiabatic mesh placed at the downstream end of the Rijke tube. We also investigate the effects of a second hot wire and a local variation of the cross-sectional area but find that both affect the frequency more than the growth rate. This application of adjoint sensitivity analysis opens up new possibilities for the passive control of thermo-acoustic oscillations. For example, the influence of base state changes can be combined with other constraints, such as that the total heat release rate remains constant, in order to show how an unstable thermo-acoustic system should be changed in order to make it stable. Copyright © 2013 by ASME.
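The core idea of adjoint-based sensitivity can be illustrated on a small matrix problem. The sketch below uses the standard first-order result for a simple eigenvalue, dλ ≈ (yᴴ dA x) / (yᴴ x), where x is the direct (right) eigenvector and y the adjoint (left) eigenvector. The 2x2 matrix `A` and the perturbation `dA` are invented toy values standing in for a linearised system and a small passive device; they are not a Rijke tube model, and the real part of λ is taken as the growth rate by assumption.

```python
import numpy as np


def eigenvalue_sensitivity(A, dA, index):
    """First-order eigenvalue drift from direct and adjoint eigenvectors.

    Returns (lam, dlam) with dlam = (y^H dA x) / (y^H x), where A x = lam x
    and A^H y = conj(lam) y.
    """
    w, V = np.linalg.eig(A)
    lam = w[index]
    x = V[:, index]
    # Adjoint problem: eigenvalues of A^H are the conjugates of those of A.
    wl, U = np.linalg.eig(A.conj().T)
    y = U[:, np.argmin(np.abs(wl - np.conj(lam)))]
    dlam = (y.conj() @ dA @ x) / (y.conj() @ x)
    return lam, dlam


# Toy lightly damped oscillator (assumed values, for illustration only).
A = np.array([[-0.1, 1.0],
              [-1.0, -0.2]])
# Toy "passive device": a small amount of extra damping on the second state.
dA = 1e-4 * np.array([[0.0, 0.0],
                      [0.0, -1.0]])

lam, dlam = eigenvalue_sensitivity(A, dA, 0)

# Check against a direct recomputation of the perturbed eigenvalue.
perturbed = np.linalg.eig(A + dA)[0]
actual = perturbed[np.argmin(np.abs(perturbed - lam))] - lam
print(dlam, actual)
```

The appeal of the adjoint route is that one direct and one adjoint solve give the drift for any candidate `dA`, so many device placements or base-state changes can be ranked without recomputing the eigenvalue problem for each one.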
Abstract:
We investigate performance bounds for feedback control of distributed plants where the controller can be centralized (i.e. it has access to measurements from the whole plant), but sensors only measure differences between neighboring subsystem outputs. Such "distributed sensing" can be a technological necessity in applications where system size exceeds accuracy requirements by many orders of magnitude. We formulate how distributed sensing generally limits feedback performance robust to measurement noise and to model uncertainty, without assuming any controller restrictions (among others, no "distributed control" restriction). A major practical consequence is the necessity to cut down integral action on some modes. We particularize the results to spatially invariant systems and finally illustrate implications of our developments for stabilizing the segmented primary mirror of the European Extremely Large Telescope. © 2013 Elsevier Ltd. All rights reserved.
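A toy picture of why difference-only sensing constrains feedback: for a chain of n subsystems, the sensing matrix that maps outputs to neighbor differences has a null space (the common mode) and small singular values for spatially smooth modes. The sketch below assumes a simple 1-D chain, which is an illustrative choice rather than the paper's plant model, and the function name is hypothetical.

```python
import numpy as np


def difference_sensing_matrix(n):
    """(n-1) x n matrix mapping outputs y to differences y[i+1] - y[i].

    Assumes a 1-D chain of subsystems; this topology is an illustrative
    choice, not taken from the abstract.
    """
    D = np.zeros((n - 1, n))
    for i in range(n - 1):
        D[i, i] = -1.0
        D[i, i + 1] = 1.0
    return D


n = 8
D = difference_sensing_matrix(n)

# A uniform offset of all outputs is invisible to the sensors.
common_mode = np.ones(n)
print(D @ common_mode)

# Singular values (descending). For this chain they equal
# 2 * sin(k * pi / (2 * n)) for k = 1..n-1, so the smoothest modes are
# sensed most weakly and the weakest gain shrinks as n grows.
singular_values = np.linalg.svd(D, compute_uv=False)
print(singular_values)
```

Reconstructing a weakly sensed mode from noisy differences amplifies the noise by roughly the inverse of its singular value, which gives some intuition for why high-gain action such as integral control may need to be cut back on those modes.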