914 results for Decision Processes


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work describes the state of the art in automatic dialogue strategy management using Markov decision processes (MDPs) with reinforcement learning (RL). Partially observable Markov decision processes (POMDPs) are also described. To test the validity of these methods, two spoken dialogue systems were developed. The first is a spoken dialogue system that provides weather forecasts, and the second is a more complex system for train information. With the first system, a rule-based system was compared with an automatically trained system, using a real corpus to train the automatic strategy. With the second system, the scalability of these methods to larger systems was tested.
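As a rough illustration of learning a dialogue strategy with an MDP and RL, the sketch below runs tabular Q-learning on a toy slot-filling dialogue. Everything in it is an assumption made for illustration and is not taken from the work above: the two actions (`ask`, `close`), the 0.8 recognition rate, the reward values, and the hyperparameters.

```python
import random

ACTIONS = ("ask", "close")
N_SLOTS = 2  # states 0..N_SLOTS count the slots filled so far (hypothetical)

def step(state, action, rng):
    """Toy user simulator: return (next_state, reward, done)."""
    if action == "close":
        # Closing is only rewarded once every slot has been filled.
        return state, (10.0 if state == N_SLOTS else -10.0), True
    # "ask" costs one turn; the answer is understood with probability 0.8.
    if state < N_SLOTS and rng.random() < 0.8:
        state += 1
    return state, -1.0, False

def q_learning(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on dialogue length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action, rng)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            target = reward if done else reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q

def greedy_policy(q):
    """Read the learned strategy off the Q-table."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}

policy = greedy_policy(q_learning())
```

In this toy setting the learned strategy keeps asking until both slots are filled and then closes the dialogue; a rule-based baseline would hard-code the same behaviour by hand.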

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Decisions about noisy stimuli require evidence integration over time. Traditionally, evidence integration and decision making are described as a one-stage process: a decision is made when evidence for the presence of a stimulus crosses a threshold. Here, we show that one-stage models cannot explain psychophysical experiments on feature fusion, where two visual stimuli are presented in rapid succession. Paradoxically, the second stimulus biases decisions more strongly than the first one, contrary to predictions of one-stage models and intuition. We present a two-stage model where sensory information is integrated and buffered before it is fed into a drift diffusion process. The model is tested in a series of psychophysical experiments and explains both accuracy and reaction time distributions. © 2012 Rüter et al.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
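To make the idea of an observation with a discrete and a continuous component concrete, here is a minimal belief-update sketch. It assumes a fixed hidden user goal, a recognizer that returns one hypothesis plus a confidence score in [0, 1], and simple triangular confidence densities (2c when the hypothesis is correct, 2(1 - c) when it is wrong). These modelling choices and the name `belief_update` are illustrative assumptions, not the model used in the work above.

```python
def belief_update(belief, hyp, conf, err=0.3):
    """Bayes update of a belief over user goals.

    belief : list of probabilities, one per candidate goal
    hyp    : index of the goal the recognizer reported (discrete component)
    conf   : confidence score in [0, 1] (continuous component)
    err    : assumed probability that the recognizer reports a wrong goal
    """
    n = len(belief)
    unnormalized = []
    for goal, prior in enumerate(belief):
        if goal == hyp:
            # Correct recognition: assumed confidence density 2 * conf.
            likelihood = (1.0 - err) * 2.0 * conf
        else:
            # Misrecognition: assumed confidence density 2 * (1 - conf).
            likelihood = err / (n - 1) * 2.0 * (1.0 - conf)
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    if total == 0.0:
        return list(belief)  # observation carries no information here
    return [u / total for u in unnormalized]

high = belief_update([0.5, 0.5], hyp=0, conf=0.9)
low = belief_update([0.5, 0.5], hyp=0, conf=0.1)
```

With a uniform prior over two goals, a high-confidence hypothesis moves most of the belief onto the reported goal, while a low-confidence one moves belief away from it. A plain MDP that treats the hypothesis as the true state cannot make that distinction.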

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties. As an illustration, the method is applied to learning a human controller.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Externalizing behavior problems of 124 adolescents were assessed across Grades 7-11. In Grade 9, participants were also assessed across social-cognitive domains after imagining themselves as the object of provocations portrayed in six videotaped vignettes. Participants responded to vignette-based questions representing multiple processes of the response decision step of social information processing. Phase 1 of our investigation supported a two-factor model of the response evaluation process of response decision (response valuation and outcome expectancy). Phase 2 showed significant relations between the set of these response decision processes, as well as response selection, measured in Grade 9 and (a) externalizing behavior in Grade 9 and (b) externalizing behavior in Grades 10-11, even after controlling for externalizing behavior in Grades 7-8. These findings suggest that on-line behavioral judgments about aggression play a crucial role in the maintenance and growth of aggressive response tendencies in adolescence.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Markov Decision Processes (MDPs) are extensively used to encode sequences of decisions with probabilistic effects. Markov Decision Processes with Imprecise Probabilities (MDPIPs) encode sequences of decisions whose effects are modeled using sets of probability distributions. In this paper we examine the computation of Γ-maximin policies for MDPIPs using multilinear and integer programming. We discuss the application of our algorithms to “factored” models and to a recent proposal, Markov Decision Processes with Set-valued Transitions (MDPSTs), that unifies the fields of probabilistic and “nondeterministic” planning in artificial intelligence research. 
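The Γ-maximin criterion picks, in each state, the action whose worst-case expected value over the set of admissible distributions is largest. The sketch below illustrates the criterion with plain value iteration, which is a swapped-in technique and not the multilinear or integer programming formulation studied in the paper. It assumes that each state-action pair comes with a finite list of candidate transition distributions (for example the vertices of its credal set) and that the adversary chooses independently for each state-action pair. All names and numbers are hypothetical.

```python
def q_value(s, a, V, reward, credal, gamma):
    """Worst-case one-step value of action a in state s over its credal set."""
    return min(
        reward[s][a] + gamma * sum(prob * V[t] for t, prob in p.items())
        for p in credal[s][a]
    )

def gamma_maximin(states, actions, reward, credal, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration with a max over actions and a min over distributions."""
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_V = {
            s: max(q_value(s, a, V, reward, credal, gamma) for a in actions)
            for s in states
        }
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, V, reward, credal, gamma))
        for s in states
    }
    return V, policy

# Hypothetical example: "risky" has an imprecise outcome, "safe" a precise one.
states = ["start", "good", "bad"]
actions = ["safe", "risky"]
reward = {
    "start": {"safe": 0.0, "risky": 0.0},
    "good": {"safe": 1.0, "risky": 1.0},
    "bad": {"safe": 0.0, "risky": 0.0},
}
credal = {
    "start": {
        "safe": [{"good": 0.6, "bad": 0.4}],
        "risky": [{"good": 0.9, "bad": 0.1}, {"good": 0.3, "bad": 0.7}],
    },
    "good": {"safe": [{"good": 1.0}], "risky": [{"good": 1.0}]},
    "bad": {"safe": [{"bad": 1.0}], "risky": [{"bad": 1.0}]},
}
V, policy = gamma_maximin(states, actions, reward, credal)
```

In this example the risky action could reach the good state with probability 0.9, but its worst case is 0.3, so the Γ-maximin policy prefers the safe action with its precise 0.6.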

Relevance:

100.00% 100.00%

Publisher:

Abstract:

M. R. Banaji and A. G. Greenwald (1995) demonstrated a gender bias in fame judgments—that is, an increase in judged fame due to prior processing that was larger for male than for female names. They suggested that participants shift criteria between judging men and women, using the more liberal criterion for judging men. This "criterion-shift" account appeared problematic for a number of reasons. In this article, 3 experiments are reported that were designed to evaluate the criterion-shift account of the gender bias in the false-fame effect against a distribution-shift account. The results were consistent with the criterion-shift account, and they helped to define more precisely the situations in which people may be ready to shift their response criterion on an item-by-item basis. In addition, the results were incompatible with an interpretation of the criterion shift as an artifact of the experimental situation in the experiments reported by M. R. Banaji and A. G. Greenwald. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Before signing electronic contracts, a rational agent should estimate the expected utilities of these contracts and calculate the violation risks related to them. In order to perform such pre-signing procedures, this agent has to be capable of computing a policy taking into account the norms and sanctions in the contracts. In relation to this, the contribution of this work is threefold. First, we present the Normative Markov Decision Process, an extension of the Markov Decision Process for explicitly representing norms. In order to illustrate the usage of our framework, we model an example in a simulated aerospace aftermarket. Second, we specify an algorithm for identifying the states of the process which characterize the violation of norms. Finally, we show how to compute policies with our framework and how to calculate the risk of violating the norms in the contracts by adopting a particular policy.
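A pre-signing check of this kind can be sketched as a finite-horizon policy evaluation that returns both the expected utility and the probability of ever entering a norm-violating state. The sketch below is not the Normative MDP formalism itself: it assumes the violating states are already known, models a sanction as a reduced reward in those states, and uses a hypothetical delivery contract with made-up numbers.

```python
def advance(dist, trans, policy):
    """Push a distribution over states one step forward under the policy."""
    nxt = {}
    for s, p in dist.items():
        for t, q in trans[s][policy[s]].items():
            nxt[t] = nxt.get(t, 0.0) + p * q
    return nxt

def evaluate(trans, reward, policy, start, violating, horizon):
    """Return (expected utility, probability of violating a norm)."""
    dist = {start: 1.0}                               # all trajectories
    clean = {} if start in violating else {start: 1.0}  # never violated so far
    utility = 0.0
    for _ in range(horizon):
        utility += sum(p * reward[s][policy[s]] for s, p in dist.items())
        dist = advance(dist, trans, policy)
        clean = {
            s: p for s, p in advance(clean, trans, policy).items()
            if s not in violating
        }
    return utility, 1.0 - sum(clean.values())

# Hypothetical contract: deliver on time, with a sanction for late delivery.
trans = {
    "pending": {
        "ship_fast": {"on_time": 0.9, "late": 0.1},
        "ship_cheap": {"on_time": 0.6, "late": 0.4},
    },
    "on_time": {"done": {"on_time": 1.0}},
    "late": {"done": {"late": 1.0}},
}
reward = {
    "pending": {"ship_fast": -2.0, "ship_cheap": -1.0},  # shipping cost
    "on_time": {"done": 10.0},                           # full payment
    "late": {"done": 2.0},                               # payment minus sanction
}
fast = {"pending": "ship_fast", "on_time": "done", "late": "done"}
cheap = {"pending": "ship_cheap", "on_time": "done", "late": "done"}
u_fast, r_fast = evaluate(trans, reward, fast, "pending", {"late"}, horizon=2)
u_cheap, r_cheap = evaluate(trans, reward, cheap, "pending", {"late"}, horizon=2)
```

Comparing the two policies side by side gives the agent both figures it needs before signing: the faster shipping policy has the higher expected utility and the lower violation risk in this toy contract.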

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Realizing value from IT investments continues to be a challenge for most healthcare organizations. IT governance (ITG) is envisaged to solve many of these challenges. ITG is the practice that establishes an accountability framework for IT investments by allocating decision rights among the major participants involved in IT decision processes. As ITG is relatively new in the healthcare industry, knowledge about how healthcare organizations govern their IT decisions is expected to be limited. This research aims to extend this knowledge and to assist both researchers and professionals by providing insights into how IT decisions are made and governed in healthcare organizations (HOs). This research adopts a case-study methodology to investigate IT governance in two distinctly different HOs. The research findings indicate that HOs implement ITG to achieve alignment between business objectives and IT. Both HOs set up a five-stage IT decision process to identify, evaluate and prioritize IT investment ideas. They also established generic committee structures with clearly defined roles and decision authorities to govern this process. It is suggested here that ITG in HOs is heavily influenced by strategic priorities, organizational structure, governance experience and governmental initiatives. Effective ITG in HOs is challenged by IT alignment, policy government, involvement of healthcare executives, and lack of business metrics to justify and evaluate decisions. The research proposes recommendations to address these challenges.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper studies the average control problem of discrete-time Markov Decision Processes (MDPs for short) with general state space, Feller transition probabilities, and possibly non-compact control constraint sets $A(x)$. Two hypotheses are considered: either the cost function $c$ is strictly unbounded or the multifunctions $A_r(x) = \{a \in A(x) : c(x, a) \le r\}$ are upper-semicontinuous and compact-valued for each real $r$. For these two cases we provide new results for the existence of a solution to the average-cost optimality equality and inequality using the vanishing discount approach. We also study the convergence of the policy iteration approach under these conditions. It should be pointed out that we do not make any assumptions regarding the convergence and the continuity of the limit function generated by the sequence of relative difference of the $\alpha$-discounted value functions and the Poisson equations as often encountered in the literature. © 2012 Elsevier Inc. All rights reserved.
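For orientation, the objects named in this abstract are usually written as follows. This is the standard textbook form, not the paper's exact statement: the symbols $Q$ (transition kernel), $V_\alpha$ (discounted value function), $h$ (relative value function), $\rho$ (optimal average cost) and the reference state $x_0$ are notational assumptions, and the limits below exist only under suitable conditions.

```latex
% alpha-discounted optimality equation, 0 < \alpha < 1
V_\alpha(x) = \min_{a \in A(x)} \Big[ c(x,a) + \alpha \int V_\alpha(y)\, Q(dy \mid x, a) \Big]

% vanishing discount approach: relative differences with respect to x_0
h_\alpha(x) = V_\alpha(x) - V_\alpha(x_0), \qquad
\rho = \lim_{\alpha \uparrow 1} (1 - \alpha)\, V_\alpha(x_0)

% average-cost optimality equality
\rho + h(x) = \min_{a \in A(x)} \Big[ c(x,a) + \int h(y)\, Q(dy \mid x, a) \Big]

% average-cost optimality inequality
\rho + h(x) \ge \min_{a \in A(x)} \Big[ c(x,a) + \int h(y)\, Q(dy \mid x, a) \Big]
```

Here $h$ is obtained as a limit of the relative differences $h_\alpha$ as $\alpha \uparrow 1$, which is the sequence the abstract refers to.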