23 results for Belief.
in the Cambridge University Engineering Department Publications Database
Abstract:
This article presents a novel algorithm for learning parameters in statistical dialogue systems which are modeled as Partially Observable Markov Decision Processes (POMDPs). The three main components of a POMDP dialogue manager are a dialogue model representing dialogue state information; a policy that selects the system's responses based on the inferred state; and a reward function that specifies the desired behavior of the system. Ideally both the model parameters and the policy would be designed to maximize the cumulative reward. However, while there are many techniques available for learning the optimal policy, no good ways of learning the optimal model parameters that scale to real-world dialogue systems have been found yet. The presented algorithm, called the Natural Actor and Belief Critic (NABC), is a policy gradient method that offers a solution to this problem. Based on observed rewards, the algorithm estimates the natural gradient of the expected cumulative reward. The resulting gradient is then used to adapt both the prior distribution of the dialogue model parameters and the policy parameters. In addition, the article presents a variant of the NABC algorithm, called the Natural Belief Critic (NBC), which assumes that the policy is fixed and only the model parameters need to be estimated. The algorithms are evaluated on a spoken dialogue system in the tourist information domain. The experiments show that model parameters estimated to maximize the expected cumulative reward result in significantly improved performance compared to the baseline hand-crafted model parameters. The algorithms are also compared to optimization techniques using plain gradients and state-of-the-art random search algorithms. In all cases, the algorithms based on the natural gradient work significantly better. © 2011 ACM.
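As a rough illustration of the natural-gradient idea mentioned in this abstract, the sketch below estimates a natural gradient from observed rewards for a one-parameter Bernoulli policy. It is not the NABC or NBC algorithm: the toy policy, the `toy_reward` function, the step size, and all names are assumptions made for this example, and the dialogue-model prior is not represented at all.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def estimate_natural_gradient(theta, reward_fn, n_episodes, rng):
    """Regress sampled rewards on the policy score d/dtheta log pi(action).

    For the Bernoulli policy pi(action=1) = sigmoid(theta), the least-squares
    slope (with an intercept acting as a baseline) estimates the natural gradient.
    """
    p = sigmoid(theta)
    scores, rewards = [], []
    for _ in range(n_episodes):
        action = 1 if rng.random() < p else 0
        scores.append(action - p)          # score function of the policy
        rewards.append(reward_fn(action))  # observed reward for this episode
    mean_s = sum(scores) / n_episodes
    mean_r = sum(rewards) / n_episodes
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(scores, rewards))
    var = sum((s - mean_s) ** 2 for s in scores)
    return cov / var if var > 0.0 else 0.0


def toy_reward(action):
    # Hypothetical rewards: action 1 is better than action 0.
    return 1.0 if action == 1 else 0.2


theta = -2.0
natural_grad = estimate_natural_gradient(theta, toy_reward, 500, random.Random(0))
p = sigmoid(theta)
plain_grad = p * (1.0 - p) * (toy_reward(1) - toy_reward(0))  # exact plain gradient
theta += 0.5 * natural_grad  # one natural-gradient ascent step (assumed step size)
```

In this toy setting the natural gradient equals the reward gap between the two actions, while the plain gradient is scaled down by p(1 - p) and so shrinks when the policy is close to deterministic.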
Abstract:
Humans have the arguably unique ability to understand the mental representations of others. For success in both competitive and cooperative interactions, however, this ability must be extended to include representations of others' belief about our intentions, their model about our belief about their intentions, and so on. We developed a "stag hunt" game in which human subjects interacted with a computerized agent using different degrees of sophistication (recursive inferences) and applied an ecologically valid computational model of dynamic belief inference. We show that rostral medial prefrontal (paracingulate) cortex, a brain region consistently identified in psychological tasks requiring mentalizing, has a specific role in encoding the uncertainty of inference about the other's strategy. In contrast, dorsolateral prefrontal cortex encodes the depth of recursion of the strategy being used, an index of executive sophistication. These findings reveal putative computational representations within prefrontal cortex regions, supporting the maintenance of cooperation in complex social decision making.
Abstract:
Many problems in control and signal processing can be formulated as sequential decision problems for general state space models. However, except for some simple models, one cannot obtain analytical solutions and has to resort to approximation. In this thesis, we have investigated problems where Sequential Monte Carlo (SMC) methods can be combined with a gradient-based search to provide solutions to online optimisation problems. We summarise the main contributions of the thesis as follows. Chapter 4 focuses on solving the sensor scheduling problem when cast as a controlled Hidden Markov Model. We consider the case in which the state, observation and action spaces are continuous. This general case is important as it is the natural framework for many applications. In sensor scheduling, our aim is to minimise the variance of the estimation error of the hidden state with respect to the action sequence. We present a novel SMC method that uses a stochastic gradient algorithm to find optimal actions. This is in contrast to existing works in the literature that only solve approximations to the original problem. In Chapter 5 we present how SMC can be used to solve a risk-sensitive control problem. We adopt the Feynman-Kac representation of a controlled Markov chain flow and exploit the properties of the logarithmic Lyapunov exponent, which lead to a policy gradient solution for the parameterised problem. The resulting SMC algorithm has a structure similar to that of the Recursive Maximum Likelihood (RML) algorithm for online parameter estimation. In Chapters 6, 7 and 8, dynamic graphical models were combined with state space models for the purpose of online decentralised inference. We have concentrated on the distributed parameter estimation problem using two Maximum Likelihood techniques, namely Recursive Maximum Likelihood (RML) and Expectation Maximization (EM). The resulting algorithms can be interpreted as an extension of the Belief Propagation (BP) algorithm to compute likelihood gradients. In order to design an SMC algorithm, Chapter 8 uses nonparametric approximations for Belief Propagation. The algorithms were successfully applied to solve the sensor localisation problem for sensor networks of small and medium size.
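For readers unfamiliar with SMC, the following is a minimal sketch of a bootstrap particle filter, the basic building block that such methods extend. It is not any of the thesis's algorithms: the scalar linear-Gaussian model, its parameters `a`, `q_std` and `r_std`, and the constant observation sequence are assumptions chosen only to keep the example short.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles, rng,
                              a=0.9, q_std=1.0, r_std=0.5):
    """Filter the assumed model x_t = a*x_{t-1} + N(0, q_std^2), y_t = x_t + N(0, r_std^2).

    Returns the estimated posterior mean of x_t after each observation.
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate every particle through the state transition.
        particles = [a * x + rng.gauss(0.0, q_std) for x in particles]
        # Weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # all weights underflowed: fall back to uniform weights
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


observations = [2.0] * 30  # hypothetical constant measurements
means = bootstrap_particle_filter(observations, 1000, random.Random(0))
```

With these assumed parameters the exact Kalman filter settles near 1.96, so the last entry of `means` should land close to that value up to Monte Carlo error.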
Abstract:
This paper argues that the widespread belief that ambiguity is beneficial in design communication stems from conceptual confusion. Communicating imprecise, uncertain and provisional ideas is a vital part of design teamwork, but what is uncertain and provisional needs to be expressed as clearly as possible. This paper argues that viewing design communication as conveying permitted spaces for further designing is a useful rationalisation for understanding what designers need from their notations and computer tools, to achieve clear communication of uncertain ideas. The paper presents a typology of ways that designs can be uncertain. It discusses how sketches and other representations of designs can be both intrinsically ambiguous, and ambiguous or misleading by failing to convey information about uncertainty and provisionality, with reference to knitwear design, where communication using inadequate representations causes severe problems. It concludes that systematic use of meta-notations for conveying provisionality and uncertainty can reduce these problems.
Abstract:
There is widespread recognition of the need for better information sharing and provision to improve the viability of end-of-life (EOL) product recovery operations. The emergence of automated data capture and sharing technologies such as RFID, sensors and networked databases has enhanced the ability to make product information available to recoverers, which will help them make better decisions regarding the choice of recovery option for EOL products. However, these technologies come at a cost, and hence the question 'what is its value?' is critical. This paper presents a probabilistic approach to model product recovery decisions and extends the concept of Bayes' factor for quantifying the impact of product information on the effectiveness of these decisions. Further, we provide a quantitative examination of the factors that influence the value of product information. This value depends on three factors: (i) penalties for Type I and Type II errors of judgement regarding product quality; (ii) prevalent uncertainty regarding product quality; and (iii) the strength of the information to support/contradict the belief. Furthermore, we show that information is not valuable under all circumstances and derive conditions for achieving a positive value of information. © 2010 Taylor & Francis.
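The three factors listed in this abstract can be illustrated with a small decision-theoretic sketch. This is not the paper's model: the binary good/bad quality, the accept/reject decision, the mapping of Type I to "rejecting a good product" and Type II to "accepting a bad product", and every name and number below are assumptions for illustration.

```python
def expected_loss(p_good, cost_type1, cost_type2):
    """Smallest expected penalty when deciding on the belief p_good alone."""
    loss_if_accept = (1.0 - p_good) * cost_type2  # Type II: accept a bad product
    loss_if_reject = p_good * cost_type1          # Type I: reject a good product
    return min(loss_if_accept, loss_if_reject)


def value_of_information(p_good, cost_type1, cost_type2,
                         p_pos_given_good, p_pos_given_bad):
    """Expected reduction in loss from observing one binary signal first."""
    prior_loss = expected_loss(p_good, cost_type1, cost_type2)
    posterior_loss = 0.0
    for positive in (True, False):
        like_good = p_pos_given_good if positive else 1.0 - p_pos_given_good
        like_bad = p_pos_given_bad if positive else 1.0 - p_pos_given_bad
        p_signal = p_good * like_good + (1.0 - p_good) * like_bad
        if p_signal == 0.0:
            continue  # this signal value can never occur
        posterior = p_good * like_good / p_signal  # Bayes' rule
        posterior_loss += p_signal * expected_loss(posterior, cost_type1, cost_type2)
    return prior_loss - posterior_loss


# Strong signal, balanced penalties, maximal prior uncertainty.
useful = value_of_information(0.5, 10.0, 10.0, 0.9, 0.1)
# A signal that is equally likely for good and bad products.
uninformative = value_of_information(0.5, 1.0, 1.0, 0.5, 0.5)
# Informative signal, but too weak to change the decision under a confident prior.
no_change = value_of_information(0.95, 10.0, 1.0, 0.7, 0.3)
```

Under these assumptions only the first case yields a positive value. In the third case the signal carries information, yet the best decision is the same for either signal outcome, so the expected loss does not improve.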
Abstract:
Deep belief networks are a powerful way to model complex probability distributions. However, learning the structure of a belief network, particularly one with hidden units, is difficult. The Indian buffet process has been used as a nonparametric Bayesian prior on the directed structure of a belief network with a single infinitely wide hidden layer. In this paper, we introduce the cascading Indian buffet process (CIBP), which provides a nonparametric prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network so each unit can additionally vary its behavior between discrete and continuous representations. We provide Markov chain Monte Carlo algorithms for inference in these belief networks and explore the structures learned on several image data sets.
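As background for this abstract, the sketch below draws a binary matrix from the standard single-layer Indian buffet process, which the abstract names as the earlier prior for one hidden layer. It does not implement the cascading construction (CIBP). Reading rows as visible units and columns as hidden units is an assumption made here, as are the function names and the value of `alpha`.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(n_rows, alpha, rng):
    """Draw an n_rows x K binary matrix from the Indian buffet process."""
    column_counts = []  # how many earlier rows already use each column
    rows = []
    for i in range(1, n_rows + 1):
        # Reuse existing column k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in column_counts]
        column_counts = [m + z for m, z in zip(column_counts, row)]
        # Open Poisson(alpha / i) brand-new columns for this row.
        n_new = sample_poisson(alpha / i, rng)
        row.extend([1] * n_new)
        column_counts.extend([1] * n_new)
        rows.append(row)
    width = len(column_counts)
    # Pad earlier rows with zeros for columns that were created later.
    return [row + [0] * (width - len(row)) for row in rows]


Z = sample_ibp(10, 2.0, random.Random(1))
```

Each entry `Z[i][k] == 1` can be read as an edge between row unit `i` and column unit `k`. The number of columns is not fixed in advance, which is what makes the prior nonparametric in width.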
Abstract:
This talk describes a new version of the Multivariable Frequency Domain Toolbox for Matlab. The intellectual issue which arises here is whether there is a role for Matlab-4 GUI facilities in a Toolbox which provides relatively low-level functionality, with a correspondingly random pattern of user interaction. My belief is that there is a role, but it is very restricted: in effect only for providing convenient 'viewing' facilities for low-level objects (which are multivariable frequency responses in the case of the MFD Toolbox). There is a more obvious role for a GUI with higher-level functions, such as frequency domain identification or parametric controller optimisation.
Abstract:
Purpose: The purpose of this paper is to present an exception to the common belief "If you can't measure it, you can't manage it". It aims to show how in certain situations particular practices, attitudes and cultures can remove the need for individual performance measurement. Design/methodology/approach: First, the paper identifies the usual roles of performance measurement in managing individual employees as described by control and motivation theorists. Second, it identifies a market-leading organisation where managers deliberately refuse to use their top-level performance measurement system to manage the performance of individual employees. A case study is carried out to test what non-measurement mechanisms fulfil the roles of individual performance measurement in this organisation. Findings: Building on situations observed at this company, a set of possible characteristics of companies that do not require formalised individual performance measurement systems in order to achieve high performance standards is put forward. Practical implications: Managers should not always assume that individual performance measurement is the only way to achieve excellent performance. This study shows that, by granting responsibilities and providing appropriate support, managers can channel workers' enhanced motivation towards meeting wider organisational goals. Originality/value: This work broadens the understanding of how excellent performance can be achieved. It shows that excellence can be achieved through practices based on shared values linked to motivation, trust, and a common sense of mission, without the need to install individual performance measurement systems based on cybernetic principles. © Emerald Group Publishing Limited.