21 results for MARKOV DECISION PROCESSES
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Abstract:
Markov Decision Processes (MDPs) are extensively used to encode sequences of decisions with probabilistic effects. Markov Decision Processes with Imprecise Probabilities (MDPIPs) encode sequences of decisions whose effects are modeled using sets of probability distributions. In this paper we examine the computation of Γ-maximin policies for MDPIPs using multilinear and integer programming. We discuss the application of our algorithms to “factored” models and to a recent proposal, Markov Decision Processes with Set-valued Transitions (MDPSTs), that unifies the fields of probabilistic and “nondeterministic” planning in artificial intelligence research.
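The Γ-maximin criterion in the abstract above picks, in each state, the action whose worst-case expected value over the set of admissible distributions is largest. The following is a minimal sketch of that idea using plain value iteration, not the multilinear or integer programming formulations the paper examines. It assumes each credal set is given as a finite list of vertex distributions, so the inner minimum of a linear function is attained at a vertex. The function name, the data layout, and the two-state toy model are all hypothetical.

```python
from typing import Dict, List, Tuple


def gamma_maximin_value_iteration(
    rewards: List[Dict[str, float]],
    credal_sets: List[Dict[str, List[List[float]]]],
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iter: int = 10_000,
) -> Tuple[List[float], List[str]]:
    """Value iteration with a max over actions and a min over credal-set vertices.

    rewards[s][a] is the immediate reward; credal_sets[s][a] is a list of
    vertex distributions over next states (an assumed representation).
    """
    n_states = len(rewards)
    values = [0.0] * n_states

    def worst_case_q(s: int, a: str, v: List[float]) -> float:
        # Nature picks the vertex that minimises the expected next-state value.
        worst = min(
            sum(p * x for p, x in zip(dist, v)) for dist in credal_sets[s][a]
        )
        return rewards[s][a] + gamma * worst

    for _ in range(max_iter):
        new_values = [
            max(worst_case_q(s, a, values) for a in rewards[s])
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged worst-case values.
    policy = [
        max(rewards[s], key=lambda a, s=s: worst_case_q(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Hypothetical toy model: state 1 is absorbing and pays 1 per step.
toy_rewards = [{"safe": 0.5, "risky": 0.0}, {"stay": 1.0}]
toy_credal_sets = [
    {"safe": [[1.0, 0.0]], "risky": [[0.2, 0.8], [0.8, 0.2]]},
    {"stay": [[0.0, 1.0]]},
]
toy_values, toy_policy = gamma_maximin_value_iteration(toy_rewards, toy_credal_sets)
```

Enumerating vertices is only practical for small, explicitly listed credal sets; sets described by constraints would need an optimisation step in place of the inner `min`.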
Abstract:
The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDPs) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs allow imprecise beliefs to be modelled, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDPs by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.
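As a rough illustration of sampling from a possibility distribution, the sketch below assumes the commonly cited form of the Dubois-Prade-Yager transformation: with degrees sorted so that π₁ = 1 ≥ π₂ ≥ … ≥ πₙ and πₙ₊₁ = 0, outcome i receives probability Σ_{j≥i} (πⱼ − πⱼ₊₁)/j. Drawing a level α uniformly from (0, 1] and then picking uniformly from the α-cut yields the same distribution. The function names are hypothetical, and this is not the search structure developed in the paper.

```python
import random
from typing import Dict, Hashable


def dpy_probabilities(possibility: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Map a normalised possibility distribution to a probability distribution."""
    if not possibility or max(possibility.values()) != 1.0:
        raise ValueError("expected a normalised possibility distribution (max = 1)")
    items = sorted(possibility.items(), key=lambda kv: kv[1], reverse=True)
    levels = [degree for _, degree in items] + [0.0]
    probs: Dict[Hashable, float] = {}
    tail = 0.0
    # Accumulate from the least possible outcome upwards so that each outcome
    # gets the sum of (pi_j - pi_{j+1}) / j over all ranks j at or below it.
    for j in range(len(items) - 1, -1, -1):
        tail += (levels[j] - levels[j + 1]) / (j + 1)
        probs[items[j][0]] = tail
    return probs


def sample_via_alpha_cut(possibility: Dict[Hashable, float], rng: random.Random) -> Hashable:
    """Draw alpha uniformly from (0, 1], then choose uniformly within the alpha-cut."""
    alpha = 1.0 - rng.random()
    cut = [outcome for outcome, degree in possibility.items() if degree >= alpha]
    return rng.choice(cut)


# Hypothetical example distribution.
example_pi = {"a": 1.0, "b": 0.5, "c": 0.2}
example_probs = dpy_probabilities(example_pi)
```

The α-cut sampler avoids building the full probability table, which is the kind of saving that matters when sampling happens inside every tree-search rollout.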
Dual-processes in learning and judgment: Evidence from the multiple cue probability learning paradigm
Abstract:
Multiple cue probability learning (MCPL) involves learning to predict a criterion based on a set of novel cues when feedback is provided in response to each judgment made. But to what extent does MCPL require controlled attention and explicit hypothesis testing? The results of two experiments show that this depends on cue polarity. Learning about cues that predict positively is aided by automatic cognitive processes, whereas learning about cues that predict negatively is especially demanding on controlled attention and hypothesis testing processes. In the studies reported here, negative, but not positive cue learning related to individual differences in working memory capacity both on measures of overall judgment performance and modelling of the implicit learning process. However, the introduction of a novel method to monitor participants' explicit beliefs about a set of cues on a trial-by-trial basis revealed that participants were engaged in explicit hypothesis testing about positive and negative cues, and explicit beliefs about both types of cues were linked to working memory capacity. Taken together, our results indicate that while people are engaged in explicit hypothesis testing during cue learning, explicit beliefs are applied to judgment only when cues are negative. © 2012 Elsevier Inc.
Abstract:
Child welfare professionals regularly make crucial decisions that have a significant impact on children and their families. The present study presents the Judgments and Decision Processes in Context model (JUDPIC) and uses it to examine the relationships between three independent domains: case characteristic (mother’s wish with regard to removal), practitioner characteristic (child welfare attitudes), and protective system context (four countries: Israel, the Netherlands, Northern Ireland and Spain); and three dependent factors: substantiation of maltreatment, risk assessment, and intervention recommendation.
The sample consisted of 828 practitioners from four countries. Participants were presented with a vignette of a case of alleged child maltreatment and were asked to determine whether maltreatment was substantiated, assess risk and recommend an intervention using structured instruments. Participants’ child welfare attitudes were assessed.
The case characteristic of mother’s wish with regard to removal had no impact on judgments and decisions. In contrast, practitioners’ child welfare attitudes were associated with substantiation, risk assessments and recommendations. There were significant country differences on most measures.
The findings support most of the predictions derived from the JUDPIC model. The significant differences between practitioners from different countries underscore the importance of context in child protection decision making. Training should enhance practitioners’ awareness of the impact that their attitudes and the context in which they are embedded have on their judgments and decisions.
Abstract:
In this paper, we investigate the remanufacturing problem of pricing single-class used products (cores) in the face of random price-dependent returns and random demand. Specifically, we propose a dynamic pricing policy for the cores and then model the problem as a continuous-time Markov decision process. Our models are designed to address three objectives: finite horizon total cost minimization, infinite horizon discounted cost, and average cost minimization. Besides proving optimal policy uniqueness and establishing monotonicity results for the infinite horizon problem, we also characterize the structures of the optimal policies, which can greatly simplify the computational procedure. Finally, we use computational examples to assess the impacts of specific parameters on optimal price and reveal the benefits of a dynamic pricing policy. © 2013 Elsevier B.V. All rights reserved.
Abstract:
In remanufacturing, the supply of used products and the demand for remanufactured products are usually mismatched because of the great uncertainties on both sides. In this paper, we propose a dynamic pricing policy to balance this uncertain supply and demand. Specifically, we study a remanufacturer’s problem of pricing a single class of cores with random price-dependent returns and random demand for the remanufactured products with backlogs. We model this pricing task as a continuous-time Markov decision process, which addresses both the finite and infinite horizon problems, and provide managerial insights by analyzing the structural properties of the optimal policy. We then use several computational examples to illustrate the impacts of particular system parameters on pricing policy.
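To make the continuous-time Markov decision process formulation more concrete, here is a minimal sketch of a discounted-cost version solved by uniformization and value iteration. It is not the authors' model: the state (stock of cores, negative for backlog), the finite price grid, the linear return-rate function, and all cost parameters are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple


def solve_pricing_ctmdp(
    prices: List[float],
    return_rate: Callable[[float], float],
    demand_rate: float,
    hold_cost: float,
    backlog_cost: float,
    max_stock: int,
    max_backlog: int,
    discount: float,
    tol: float = 1e-8,
    max_iter: int = 100_000,
) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Discounted-cost value iteration on the uniformized chain (hypothetical model)."""
    states = list(range(-max_backlog, max_stock + 1))
    # Uniformization constant: an upper bound on the total event rate in any state.
    big_rate = max(return_rate(p) for p in prices) + demand_rate
    values = {x: 0.0 for x in states}
    policy: Dict[int, float] = {}
    for _ in range(max_iter):
        new_values: Dict[int, float] = {}
        for x in states:
            cost_rate = hold_cost * max(x, 0) + backlog_cost * max(-x, 0)
            down = max(x - 1, -max_backlog)  # demand is lost once the backlog is full
            up = min(x + 1, max_stock)
            best = None
            for p in prices:
                # Returns are refused (rate 0) when the stock is at capacity.
                lam = return_rate(p) if x < max_stock else 0.0
                total = (
                    cost_rate
                    + lam * p  # expected acquisition spend per unit time
                    + lam * values[up]
                    + demand_rate * values[down]
                    + (big_rate - lam - demand_rate) * values[x]
                ) / (discount + big_rate)
                if best is None or total < best[0]:
                    best = (total, p)
            new_values[x], policy[x] = best
        delta = max(abs(new_values[x] - values[x]) for x in states)
        values = new_values
        if delta < tol:
            break
    return values, policy


# Hypothetical parameters: a higher acquisition price attracts more returns.
demo_prices = [0.0, 1.0, 2.0]
demo_values, demo_policy = solve_pricing_ctmdp(
    prices=demo_prices,
    return_rate=lambda p: 0.5 + 0.5 * p,
    demand_rate=1.0,
    hold_cost=1.0,
    backlog_cost=5.0,
    max_stock=5,
    max_backlog=5,
    discount=0.1,
)
```

Inspecting `demo_policy` across states is one way to explore how the chosen price varies with the stock level under these assumed parameters.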
Abstract:
Background
The use of multiple medicines (polypharmacy) is increasingly common in older people. Ensuring that patients receive the most appropriate combinations of medications (appropriate polypharmacy) is a significant challenge. The quality of evidence to support the effectiveness of interventions to improve appropriate polypharmacy is low. Systematic identification of mediators of behaviour change, using the Theoretical Domains Framework (TDF), provides a theoretically robust evidence base to inform intervention design. This study aimed to (1) identify key theoretical domains that were perceived to influence the prescribing and dispensing of appropriate polypharmacy to older patients by general practitioners (GPs) and community pharmacists, and (2) map domains to associated behaviour change techniques (BCTs) to include as components of an intervention to improve appropriate polypharmacy in older people in primary care.
Methods
Semi-structured interviews were conducted with members of each healthcare professional (HCP) group using tailored topic guides based on TDF version 1 (12 domains). Questions covering each domain explored HCPs’ perceptions of barriers and facilitators to ensuring the prescribing and dispensing of appropriate polypharmacy to older people. Interviews were audio-recorded and transcribed verbatim. Data analysis involved the framework method and content analysis. Key domains were identified and mapped to BCTs based on established methods and discussion within the research team.
Results
Thirty HCPs were interviewed (15 GPs, 15 pharmacists). Eight key domains were identified, perceived to influence prescribing and dispensing of appropriate polypharmacy: ‘Skills’, ‘Beliefs about capabilities’, ‘Beliefs about consequences’, ‘Environmental context and resources’, ‘Memory, attention and decision processes’, ‘Social/professional role and identity’, ‘Social influences’ and ‘Behavioural regulation’. Following mapping, four BCTs were selected for inclusion in an intervention for GPs or pharmacists: ‘Action planning’, ‘Prompts/cues’, ‘Modelling or demonstrating of behaviour’ and ‘Salience of consequences’. An additional BCT (‘Social support or encouragement’) was selected for inclusion in a community pharmacy-based intervention in order to address barriers relating to interprofessional working that were encountered by pharmacists.
Conclusions
Selected BCTs will be operationalised in a theory-based intervention to improve appropriate polypharmacy for older people, to be delivered in GP practice and community pharmacy settings. Future research will involve development and feasibility testing of this intervention.
Abstract:
What-if simulations have been identified as one solution for business performance related decision support. Such support is especially useful in cases where it can be automatically generated out of Business Process Management (BPM) environments from the existing business process models and performance parameters monitored from the executed business process instances. Currently, some of the available BPM environments offer basic-level performance prediction capabilities. However, these functionalities are normally too limited to be generally useful for performance related decision support at business process level. In this paper, an approach is presented which allows the non-intrusive integration of sophisticated tooling for what-if simulations, analytic performance prediction tools, process optimizations, or a combination of such solutions into already existing BPM environments. The approach abstracts from process modelling techniques, which enables automatic decision support spanning processes across numerous BPM environments. For instance, this enables end-to-end decision support for composite processes modelled with the Business Process Modelling Notation (BPMN) on top of existing Enterprise Resource Planning (ERP) processes modelled with proprietary languages.
Abstract:
In order to achieve progress towards sustainable resource management, it is essential to evaluate options for the reuse and recycling of secondary raw materials, in order to provide a robust evidence base for decision makers. This paper presents the research undertaken in the development of a web-based decision-support tool (the used tyres resource efficiency tool) to compare three processing routes for used tyres with their existing primary alternatives. Primary data on the energy and material flows for the three routes and their alternatives were collected and analysed. The methodology used was a streamlined life-cycle assessment (sLCA) approach. Processes included were: car tyre baling against aggregate gabions; car tyre retreading against new car tyres; and car tyre shred used in landfill engineering against primary aggregates. The outputs of the assessment, and web-based tool, were estimates of raw materials used, carbon dioxide emissions and costs. The paper discusses the benefits of carrying out a streamlined LCA and using the outputs of this analysis to develop a decision-support tool. The strengths and weaknesses of this approach are discussed and future research priorities identified which could facilitate the use of life cycle approaches by designers and practitioners.
Abstract:
Research on surgical decision making and risk management usually focuses on peri-operative care, despite the magnitude and frequency of intra-operative risks. The aim of this study was to examine ophthalmic surgeons' intra-operative decisions and risk management strategies in order to explore differences in cognitive processes.
Abstract:
The article examines why a comprehensive settlement to resolve the Cyprus problem has yet to be reached despite the existence of a positive incentive structure and the proactive involvement of regional and international organizations, including the European Union and the United Nations. To address this question, evidence from critical turning points in foreign policy decision-making in Turkey, Greece and the two communities in Cyprus is drawn on. The role of hegemonic political discourses is emphasized, and it is argued that the latter have prevented an accurate evaluation of incentives that could have set the stage for a constructive settlement. However, despite the political debacle in the Cypriot negotiations, success stories have emerged, such as the reactivation of the Committee for Missing Persons (CMP), a defunct body for almost 25 years, to become the most successful bi-communal project following Cyprus’s EU accession. Contradictory evidence in the Cypriot peace process is evaluated and policy lessons to be learned from the CMP ‘success story’ are identified.