9 results for Action Learning Cycle
in the Cambridge University Engineering Department Publications Database
Abstract:
This paper reflects on the motivation, method and effectiveness of teaching leadership and organisational change to graduate engineers. Delivering progress towards sustainable development requires engineers who are aware of pressing global issues (such as resource depletion, climate change, social inequity and an interdependent economy) since it is they who deliver the goods and services that underpin society within these constraints. In recognition of this fact, the Cambridge University MPhil in Engineering for Sustainable Development has focussed on educating engineers to become effective change agents in their professional field with the confidence to challenge orthodoxy in adopting traditional engineering solutions. This paper reflects on ten years of delivering this course to review how the teaching of change management and leadership aspects of the programme has evolved and progressed over that time. As the students on this professional practice course often have extensive experience as practising engineers and scientists, they have learned the limitations of their technical background when solving complex problems. Students often join the course recognising their need to broaden their knowledge of relevant cross-disciplinary skills. The course offers an opportunity for these early to mid-career engineers to explore an ethical and value-based approach to bringing about effective change in their particular sectors and organisations. This is achieved through action learning assignments in combination with reflections on the theory of change to enable students to equip themselves with tools that help them to be effective in making their professional and personal life choices. This paper draws on feedback gathered from students during their participation on the course and augments this with alumni reflections gathered some years after their graduation. These professionals are able to look back on their experience of the taught components and reflect on how they have been able to apply this key learning in their subsequent careers.
Abstract:
This paper reflects on the motivation, method and effectiveness of teaching leadership and organisational change to graduate engineers. Delivering progress towards sustainable development requires engineers who are aware of pressing global issues (such as resource depletion, climate change, social inequity and an interdependent economy) since it is they who deliver the goods and services that underpin society within these constraints. They must also understand how to implement change in the organisations within which they will work. In recognition of this fact, the Cambridge University MPhil in Engineering for Sustainable Development has focussed on educating engineers to become effective change agents in their professional field with the confidence to challenge orthodoxy in adopting traditional engineering solutions. This paper reflects on ten years of delivering a special module to review how the teaching of change management and leadership aspects of the programme has evolved and progressed over that time. As the students who embark on this professional practice programme often have extensive experience as practising engineers and scientists, many have already learned the limitations of their technical background when solving complex problems. Students often join the course recognising their need to broaden their knowledge of relevant cross-disciplinary skills. The programme offers an opportunity for these early to mid-career engineers to explore an ethical and value-based approach to bringing about effective change in their particular sectors and organisations. This is achieved through action learning assignments in combination with reflections on the theory of change to enable students to equip themselves with tools that help them to be effective in making their professional and personal life choices. This paper draws on feedback gathered from students during their participation on the programme and augments this with alumni reflections gathered some years after their graduation. These professionals are able to look back on their experience of the taught components and reflect on how they have been able to apply this key learning in their subsequent careers. Copyright © 2012 September.
Abstract:
Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the subjects. This can be seen in both experimental and natural circumstances. In this paper we study the consequences of importing this competition into a reinforcement learning context, and demonstrate the resulting effects in an omission schedule and a maze navigation task. The misbehavior created by Pavlovian values can be quite debilitating; we discuss how it may be disciplined.
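A minimal toy simulation in Python, not the authors' model, may help make the competition concrete: an assumed weight omega lets a learned Pavlovian cue value bias an instrumental softmax choice in an omission schedule, where approaching the cue cancels the reward. All parameter names and values below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1      # learning rate for both value systems (assumed)
omega = 2.0      # weight of the Pavlovian bias on choice (assumed, deliberately large)
q = np.zeros(2)  # instrumental action values: index 0 = withhold, 1 = approach
v_cue = 0.0      # Pavlovian value of the reward-predicting cue

approach = []
for trial in range(2000):
    # The Pavlovian cue value biases only the "approach" action.
    m = q.copy()
    m[1] += omega * v_cue
    p_approach = np.exp(m[1]) / np.exp(m).sum()
    a = int(rng.random() < p_approach)

    r = 1.0 if a == 0 else 0.0       # omission schedule: approaching cancels the reward
    q[a] += alpha * (r - q[a])       # instrumental update for the chosen action
    v_cue += alpha * (r - v_cue)     # Pavlovian update: the cue predicts reward on average
    approach.append(a)

print("approach rate over the last 500 trials:", np.mean(approach[-500:]))

Setting omega to 0 recovers a purely instrumental learner with a markedly lower approach rate; the gap is the toy analogue of the misbehaviour the abstract describes.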
Abstract:
'Learning to learn' phenomena have been widely investigated in cognition, perception and more recently also in action. During concept learning tasks, for example, it has been suggested that characteristic features are abstracted from a set of examples with the consequence that learning of similar tasks is facilitated, a process termed 'learning to learn'. From a computational point of view such an extraction of invariants can be regarded as learning of an underlying structure. Here we review the evidence for structure learning as a 'learning to learn' mechanism, especially in sensorimotor control where the motor system has to adapt to variable environments. We review studies demonstrating that common features of variable environments are extracted during sensorimotor learning and exploited for efficient adaptation in novel tasks. We conclude that structure learning plays a fundamental role in skill learning and may underlie the unsurpassed flexibility and adaptability of the motor system.
Abstract:
This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
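For orientation only, the Python sketch below computes the object the abstract concerns, the optimal value function of a small randomly generated MDP, by standard value iteration; it does not reproduce the paper's Bayesian model or MCMC sampler, and all sizes and parameters are assumed.

import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.random((n_states, n_actions))                             # R[s, a]

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V          # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
    V_new = Q.max(axis=1)          # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print("optimal value function V* =", np.round(V, 3))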
Abstract:
A novel framework is provided for very fast model-based reinforcement learning in continuous state and action spaces. It requires probabilistic models that explicitly characterize their levels of confidence. Within the framework, flexible, non-parametric models are used to describe the world based on previously collected experience. Learning is demonstrated on the cart-pole problem in a setting where very limited prior knowledge about the task has been provided. Learning progressed rapidly, and a good policy was found after only a small number of iterations.
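The following is a rough Python skeleton of the model-based loop the abstract outlines (collect experience, fit a probabilistic model, improve the policy on the model, repeat); the paper's Gaussian-process dynamics models and analytic policy optimisation are not reproduced, and the linear-Gaussian stand-in model, the toy scalar system and all constants are assumptions.

import numpy as np

rng = np.random.default_rng(2)

def true_dynamics(x, u):                      # the real system, unknown to the learner
    return 0.9 * x + 0.5 * u + 0.05 * rng.standard_normal()

def rollout(policy_gain, model, x0=1.0, horizon=20):
    # Evaluate a linear feedback policy u = -gain * x on the learned model.
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -policy_gain * x
        x = model(x, u)
        cost += x ** 2 + 0.1 * u ** 2
    return cost

data, gain = [], 0.0
for iteration in range(5):
    # 1. Interact with the real system under the current policy.
    x = 1.0
    for _ in range(20):
        u = -gain * x
        x_next = true_dynamics(x, u)
        data.append((x, u, x_next))
        x = x_next

    # 2. Fit a simple probabilistic (linear-Gaussian) model to all collected experience.
    X = np.array([(s, a) for s, a, _ in data])
    y = np.array([sn for _, _, sn in data])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_std = np.std(y - X @ w)
    model = lambda s, a, w=w, sd=resid_std: w[0] * s + w[1] * a + sd * rng.standard_normal()

    # 3. Improve the policy on the learned model (crude random search over the gain).
    candidates = np.linspace(0.0, 2.0, 41)
    gain = min(candidates, key=lambda k: np.mean([rollout(k, model) for _ in range(30)]))

print("learned feedback gain:", gain)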
Abstract:
Common-rail fuel injection systems on modern light duty diesel engines are effectively able to respond instantaneously to changes in the demanded injection quantity. In contrast, the air-system is subject to significantly slower dynamics, primarily due to filling/emptying effects in the manifolds and turbocharger inertia. The behaviour of the air-path in a diesel engine is therefore the main limiting factor in terms of engine-out emissions during transient operation. This paper presents a simple mean-value model for the air-path during throttled operation, which is used to design a feed-forward controller that delivers very rapid changes in the in-cylinder charge properties. The feed-forward control action is validated using a state-of-the-art sampling system that allows true cycle-by-cycle measurement of the in-cylinder CO2 concentration. © 2011 SAE International.
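As a purely illustrative Python sketch, not the paper's mean-value model, the snippet below inverts a first-order filling/emptying manifold model to form a feed-forward command that tracks a demanded in-cylinder charge; the time constant, the normalised signals and the plant itself are assumptions.

import numpy as np

tau = 0.3            # assumed manifold filling time constant [s]
dt = 0.01            # simulation step [s]
t = np.arange(0, 2, dt)
demand = np.where(t < 1.0, 0.4, 0.7)    # demanded in-cylinder charge (normalised)

# Plant model: tau * dp/dt = u - p  (u = actuator command, p = charge proxy).
# Feed-forward by model inversion: u = demand + tau * d(demand)/dt.
ddemand = np.gradient(demand, dt)
u_ff = demand + tau * ddemand

p = np.zeros_like(t)
for k in range(1, len(t)):
    p[k] = p[k - 1] + dt / tau * (u_ff[k - 1] - p[k - 1])   # Euler integration of the plant

print("steady-state tracking error after the step:", abs(p[-1] - demand[-1]))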
Abstract:
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic plasticity. On the other hand, the theory of reinforcement learning provides a framework for reward-based learning. Recent models of reward-modulated spike-timing-dependent plasticity have made first steps towards bridging the gap between the two approaches, but faced two problems. First, reinforcement learning is typically formulated in a discrete framework, ill-adapted to the description of natural situations. Second, biologically plausible models of reward-modulated spike-timing-dependent plasticity require precise calculation of the reward prediction error, yet it remains to be shown how this can be computed by neurons. Here we propose a solution to these problems by extending the continuous temporal difference (TD) learning of Doya (2000) to the case of spiking neurons in an actor-critic network operating in continuous time, and with continuous state and action representations. In our model, the critic learns to predict expected future rewards in real time. Its activity, together with actual rewards, conditions the delivery of a neuromodulatory TD signal to itself and to the actor, which is responsible for action choice. In simulations, we show that such an architecture can solve a Morris water-maze-like navigation task, in a number of trials consistent with reported animal performance. We also use our model to solve the acrobot and the cartpole problems, two complex motor control tasks. Our model provides a plausible way of computing reward prediction error in the brain. Moreover, the analytically derived learning rule is consistent with experimental evidence for dopamine-modulated spike-timing-dependent plasticity.
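A minimal tabular actor-critic in discrete time, sketched below in Python, shows the shared TD-error signal the abstract builds on; the paper's continuous-time spiking-neuron implementation, its reward-modulated plasticity rule and the water-maze task are not reproduced, and the chain task and learning rates are assumptions.

import numpy as np

rng = np.random.default_rng(3)
n_states, gamma, alpha_v, alpha_p = 5, 0.95, 0.1, 0.1
V = np.zeros(n_states)                     # critic: predicted future reward per state
prefs = np.zeros((n_states, 2))            # actor: action preferences (0 = left, 1 = right)

for episode in range(2000):
    s = 0
    while s < n_states - 1:                # goal is the right-most state
        p = np.exp(prefs[s]) / np.exp(prefs[s]).sum()
        a = rng.choice(2, p=p)
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD error: the shared "neuromodulatory" signal broadcast to critic and actor.
        delta = r + gamma * V[s_next] * (s_next != n_states - 1) - V[s]
        V[s] += alpha_v * delta            # critic update
        prefs[s, a] += alpha_p * delta     # actor favours actions that yield positive delta
        s = s_next

print("learned state values:", np.round(V, 2))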
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
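A hedged Python sketch of the general Bayesian idea, not the paper's sampler: place a prior on reward parameters, score candidates by how well a softmax-optimal policy explains the observed state-action pairs, and draw posterior samples with random-walk Metropolis. The parameter expansion step and the paper's specific MCMC moves are omitted, and every constant below is an assumption.

import numpy as np

rng = np.random.default_rng(4)
n_states, n_actions, gamma, beta = 4, 2, 0.9, 5.0
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # transition model P[s, a, s']

def q_values(reward):
    # Optimal action values for a candidate state-reward vector, by value iteration.
    V = np.zeros(n_states)
    for _ in range(200):
        Q = reward[:, None] + gamma * P @ V
        V = Q.max(axis=1)
    return Q

# Synthetic "demonstrations": state-action pairs from a softmax policy under a true reward.
true_r = np.array([0.0, 0.0, 0.0, 1.0])
Q_true = q_values(true_r)
demos = []
for _ in range(200):
    s = rng.integers(n_states)
    p = np.exp(beta * Q_true[s]); p /= p.sum()
    demos.append((s, rng.choice(n_actions, p=p)))

def log_post(r):
    # Softmax-choice likelihood of the demonstrations plus a standard normal prior.
    Q = q_values(r)
    ll = sum(beta * Q[s, a] - np.log(np.exp(beta * Q[s]).sum()) for s, a in demos)
    return ll - 0.5 * np.sum(r ** 2)

r, lp = np.zeros(n_states), log_post(np.zeros(n_states))
samples = []
for it in range(2000):
    r_prop = r + 0.1 * rng.standard_normal(n_states)    # random-walk proposal
    lp_prop = log_post(r_prop)
    if np.log(rng.random()) < lp_prop - lp:              # Metropolis accept/reject
        r, lp = r_prop, lp_prop
    samples.append(r)

print("posterior mean reward:", np.round(np.mean(samples[-1000:], axis=0), 2))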