5 results for "middle manager"
in Cambridge University Engineering Department Publications Database
Abstract:
Effective dialogue management is critically dependent on the information that is encoded in the dialogue state. In order to deploy reinforcement learning for policy optimization, dialogue must be modeled as a Markov Decision Process. This requires that the dialogue state encode all relevant information obtained during the dialogue prior to that state. This can be achieved by combining the user goal, the dialogue history, and the last user action to form the dialogue state. In addition, to gain robustness to input errors, dialogue must be modeled as a Partially Observable Markov Decision Process (POMDP) and hence a distribution over all possible states must be maintained at every dialogue turn. This poses a potential computational limitation, since there can be a very large number of dialogue states. The Hidden Information State model provides a principled way of ensuring tractability in a POMDP-based dialogue model. The key feature of this model is the grouping of user goals into partitions that are dynamically built during the dialogue. In this article, we extend this model further to incorporate the notion of complements. This allows a more complex user goal to be represented, and it enables an effective pruning technique that preserves overall system performance within limited computational resources more effectively than existing approaches. © 2011 ACM.
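As a rough illustration of how partition-based belief tracking keeps the POMDP tractable, the sketch below maintains belief mass over goal partitions that are split lazily as new slot-value hypotheses arrive, then reweighted by observation likelihoods each turn. This is a minimal sketch, not the Hidden Information State implementation: the Partition class, the split prior, and the likelihood function are illustrative assumptions, and the article's complements are omitted.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    """One hypothesis: a set of user goals sharing these constraints."""
    constraints: frozenset   # e.g. frozenset({("food", "italian")})
    belief: float            # probability mass currently on this partition

def split(p: Partition, slot: str, value: str, prior: float) -> list:
    """Lazily split a partition on a new slot-value hypothesis, dividing
    its mass according to the prior that the hypothesis holds."""
    yes = Partition(p.constraints | {(slot, value)}, p.belief * prior)
    no = Partition(p.constraints, p.belief * (1.0 - prior))
    return [yes, no]

def update(partitions: list, likelihood) -> list:
    """One belief-update turn: weight each partition by the likelihood of
    the observed (noisy) user input given its constraints, renormalize."""
    for p in partitions:
        p.belief *= likelihood(p.constraints)
    total = sum(p.belief for p in partitions) or 1.0
    for p in partitions:
        p.belief /= total
    return partitions

# Example turn: start from a single root partition, split on a hypothesis
# from the speech understander, then reweight by its confidence score.
root = Partition(frozenset(), 1.0)
parts = split(root, "food", "italian", prior=0.1)
parts = update(parts, lambda c: 0.8 if ("food", "italian") in c else 0.2)
print([(sorted(p.constraints), round(p.belief, 3)) for p in parts])
```

In a scheme like this, pruning would amount to discarding low-belief partitions and folding their mass back into a remainder, which is where the article's complement representation is intended to preserve performance under a fixed computational budget.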
Abstract:
In contrast to the wealth of data describing the neural mechanisms underlying classical conditioning, we know remarkably little about the mechanisms involved in acquisition of explicit contingency awareness. Subjects variably acquire contingency awareness in classical conditioning paradigms, in which they are able to describe the temporal relationship between a conditioned cue and its outcome. Previous studies have implicated the hippocampus and prefrontal cortex in the acquisition of explicit knowledge, although their specific roles remain unclear. We used functional magnetic resonance imaging to track the trial-by-trial acquisition of explicit knowledge in a concurrent trace and delay conditioning paradigm. We show that activity in bilateral middle frontal gyrus and parahippocampal gyrus correlates with the accuracy of explicit contingency awareness on each trial. In contrast, amygdala activation correlates with conditioned responses indexed by skin conductance responses (SCRs). These results demonstrate that brain regions known to be involved in other aspects of learning and memory also play a specific role, reflecting on each trial the acquisition and representation of contingency awareness.
Abstract:
A partially observable Markov decision process (POMDP) has been proposed as a dialog model that enables automatic optimization of the dialog policy and provides robustness to speech understanding errors. Various approximations allow such a model to be used for building real-world dialog systems. However, they require a large number of dialogs to train the dialog policy, and hence they typically rely on the availability of a user simulator. They also require significant designer effort to hand-craft the policy representation. We investigate the use of Gaussian processes (GPs) in policy modeling to overcome these problems. We show that GP policy optimization can be implemented for a real-world POMDP dialog manager, and in particular: 1) we examine different formulations of a GP policy to minimize variability in the learning process; 2) we find that the use of GPs increases the learning rate by an order of magnitude, thereby allowing learning by direct interaction with human users; and 3) we demonstrate that designer effort can be substantially reduced by basing the policy directly on the full belief space, thereby avoiding ad hoc feature space modeling. Overall, the GP approach represents an important step towards fully automatic dialog policy optimization in real-world systems. © 2013 IEEE.
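To make point (3) concrete, here is a minimal sketch, not the paper's GP-SARSA learner: one Gaussian process regressor per action is fit directly on raw belief vectors, with no hand-crafted feature space, and actions are then chosen greedily from the posterior means. The RBF kernel, the synthetic training data, and the `act` helper are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
n_actions, belief_dim = 3, 10

# Synthetic logged data: belief states, actions taken, observed returns.
B = rng.dirichlet(np.ones(belief_dim), size=50)   # beliefs sum to 1
A = rng.integers(0, n_actions, size=50)           # actions taken
returns = rng.normal(size=50)                     # sampled returns

# One GP per action, fit on the full belief vector (no feature design).
gps = []
for a in range(n_actions):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=0.1)
    gp.fit(B[A == a], returns[A == a])
    gps.append(gp)

def act(belief: np.ndarray) -> int:
    """Greedy action: argmax over the per-action GP posterior means."""
    q = [gp.predict(belief.reshape(1, -1))[0] for gp in gps]
    return int(np.argmax(q))

print(act(rng.dirichlet(np.ones(belief_dim))))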