31 results for Markov Decision Process
Abstract:
Markov Decision Processes (MDPs) are extensively used to encode sequences of decisions with probabilistic effects. Markov Decision Processes with Imprecise Probabilities (MDPIPs) encode sequences of decisions whose effects are modeled using sets of probability distributions. In this paper we examine the computation of Γ-maximin policies for MDPIPs using multilinear and integer programming. We discuss the application of our algorithms to “factored” models and to a recent proposal, Markov Decision Processes with Set-valued Transitions (MDPSTs), that unifies the fields of probabilistic and “nondeterministic” planning in artificial intelligence research.
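To make the Γ-maximin criterion concrete, here is a minimal sketch in Python. It is not the multilinear or integer programming formulation the abstract refers to; it swaps in plain value iteration and assumes each credal set is given as a finite list of vertex distributions, so the inner minimum can be taken by enumeration. The function name `gamma_maximin_vi`, the data layout, and the toy model are all hypothetical.

```python
from typing import Dict, List, Tuple

State = str
Action = str
Dist = Dict[State, float]  # one next-state probability distribution


def gamma_maximin_vi(
    states: List[State],
    actions: List[Action],
    credal: Dict[Tuple[State, Action], List[Dist]],  # assumed: vertex lists
    reward: Dict[Tuple[State, Action], float],
    discount: float = 0.9,
    tol: float = 1e-8,
    max_iter: int = 10_000,
):
    """Value iteration where 'nature' picks the worst distribution per step."""

    def worst_case_q(s: State, a: Action, values: Dict[State, float]) -> float:
        # Inner minimisation: the expectation is linear in the distribution,
        # so over a convex credal set it is attained at one of the vertices.
        worst = min(
            sum(p * values[s2] for s2, p in dist.items())
            for dist in credal[(s, a)]
        )
        return reward[(s, a)] + discount * worst

    values = {s: 0.0 for s in states}
    for _ in range(max_iter):
        # Outer maximisation: the agent picks the best worst-case action.
        new_values = {
            s: max(worst_case_q(s, a, values) for a in actions) for s in states
        }
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break

    policy = {
        s: max(actions, key=lambda a: worst_case_q(s, a, values)) for s in states
    }
    return values, policy


# Hypothetical toy model: action "a" pays more now but has an imprecise
# outcome (stay in s0 or fall into the zero-reward sink s1); "b" is precise.
states = ["s0", "s1"]
actions = ["a", "b"]
credal = {
    ("s0", "a"): [{"s0": 1.0}, {"s1": 1.0}],
    ("s0", "b"): [{"s0": 1.0}],
    ("s1", "a"): [{"s1": 1.0}],
    ("s1", "b"): [{"s1": 1.0}],
}
reward = {("s0", "a"): 1.0, ("s0", "b"): 0.5, ("s1", "a"): 0.0, ("s1", "b"): 0.0}

values, policy = gamma_maximin_vi(states, actions, credal, reward)
```

In this toy model the worst case for action "a" is always the sink, so the Γ-maximin policy prefers the precise action "b" in `s0`, whose value converges to 0.5 / (1 - 0.9) = 5.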
Abstract:
In this paper, we investigate the remanufacturing problem of pricing single-class used products (cores) in the face of random price-dependent returns and random demand. Specifically, we propose a dynamic pricing policy for the cores and then model the problem as a continuous-time Markov decision process. Our models are designed to address three objectives: finite horizon total cost minimization, infinite horizon discounted cost, and average cost minimization. Besides proving optimal policy uniqueness and establishing monotonicity results for the infinite horizon problem, we also characterize the structures of the optimal policies, which can greatly simplify the computational procedure. Finally, we use computational examples to assess the impacts of specific parameters on optimal price and reveal the benefits of a dynamic pricing policy. © 2013 Elsevier B.V. All rights reserved.
Abstract:
In remanufacturing, the supply of used products and the demand for remanufactured products are usually mismatched because of the great uncertainties on both sides. In this paper, we propose a dynamic pricing policy to balance this uncertain supply and demand. Specifically, we study a remanufacturer’s problem of pricing a single class of cores with random price-dependent returns and random demand for the remanufactured products with backlogs. We model this pricing task as a continuous-time Markov decision process, which addresses both the finite and infinite horizon problems, and provide managerial insights by analyzing the structural properties of the optimal policy. We then use several computational examples to illustrate the impacts of particular system parameters on pricing policy.
Abstract:
This article reports results of an experiment designed to analyze the link between risky decisions made by couples and risky decisions made separately by each spouse. We estimate both the spouses' and the couples' degrees of risk aversion, we assess how the risk preferences of the two spouses aggregate when they make risky decisions, and we shed light on the dynamics of the decision process that takes place when couples make risky decisions. We find that, far from being fixed, the balance of power within the household is malleable. In most couples, men initially have more decision-making power than women, but women who ultimately implement the joint decisions gain more and more power over the course of decision making.
Abstract:
Hypothetical contingent valuation surveys used to elicit values for environmental and other public goods often employ variants of the referendum mechanism because this voting format is cognitively simple and familiar to respondents. One variant, the double referendum mechanism, requires respondents to state twice how they would vote for a given policy proposal given their cost of the good. Data from these surveys often exhibit anomalies inconsistent with standard economic models of consumer preferences. There are a number of published explanations for these anomalies, mostly focusing on problems with the second vote. This article investigates which aspects of the hypothetical task affect the degree of non-demand revelation and takes an individual-based approach to identifying the people most likely to non-demand reveal. A clear profile emerges from our model of a person who faces a negative surplus (i.e., a net loss) in the second vote and invokes non-self-interested, non-financial motivations during the decision process.
Abstract:
Making a decision is often a matter of listing and comparing positive and negative arguments. In such cases, the evaluation scale for decisions should be considered bipolar, that is, negative and positive values should be explicitly distinguished. That is what is done, for example, in Cumulative Prospect Theory. However, contrary to the latter framework, which presupposes genuine numerical assessments, human agents often decide on the basis of an ordinal ranking of the pros and the cons, and by focusing on the most salient arguments. In other words, the decision process is qualitative as well as bipolar. In this article, based on a bipolar extension of possibility theory, we define and axiomatically characterize several decision rules tailored for the joint handling of positive and negative arguments in an ordinal setting. The simplest rules can be viewed as extensions of the maximin and maximax criteria to the bipolar case, and consequently suffer from poor decisive power. More decisive rules that refine the former are also proposed. These refinements agree both with principles of efficiency and with the spirit of order-of-magnitude reasoning that prevails in qualitative decision theory. The most refined decision rule uses leximin rankings of the pros and the cons, and the ideas of counting arguments of equal strength and cancelling pros by cons. It is shown to come down to a special case of Cumulative Prospect Theory, and to subsume the “Take the Best” heuristic studied by cognitive psychologists.
Abstract:
While the repeated nature of Discrete Choice Experiments is advantageous from a sampling efficiency perspective, patterns of choice may differ across the tasks, due, in part, to learning and fatigue. Using probabilistic decision process models, we find in a field study that learning and fatigue behavior may only be exhibited by a small subset of respondents. Most respondents in our sample show preference and variance stability consistent with rational pre-existent and well-formed preferences. Nearly all of the remainder exhibit both learning and fatigue effects. An important aspect of our approach is that it enables learning and fatigue effects to be explored, even though they were not envisaged during survey design or data collection.
Abstract:
The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDPs) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs allow imprecise beliefs to be modeled, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDPs by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.
Abstract:
A set of established reasons is commonly used to explain rural in-migration. These include quality of life, rural idyll and lifestyle motivations. However, such one-dimensional sound bites portray rural in-migration in overly simplistic and stereotypical terms. In contrast, this paper distinguishes the decision to move from the reason for moving and in doing so sheds new light on the interconnections between different domains (family, work, finance, health) of the migrant's life which contribute to migration behaviour. Focussing on early retirees to mid-Wales and adopting a life course perspective, the overall decision to move is disaggregated into a series of decisions. Giving voices to the migrants themselves demonstrates the combination of life events necessary to lead to migration behaviour, the variable factors (and often economic dominance) considered in the choice of destination (including that many are reluctant migrants to Wales), and the perceived 'accidental' choice of location and/or property. It is argued that quality of life, rural idyll and lifestyle sound bites offer an inadequate understanding of rural in-migration and associated decision-making processes. Moreover, they disguise the true nature of migrant decision making.
Abstract:
Previous studies have revealed considerable interobserver and intraobserver variation in the histological classification of preinvasive cervical squamous lesions. The aim of the present study was to develop a decision support system (DSS) for the histological interpretation of these lesions. Knowledge and uncertainty were represented in the form of a Bayesian belief network that permitted the storage of diagnostic knowledge and, for a given case, the collection of evidence in a cumulative manner that provided a final probability for the possible diagnostic outcomes. The network comprised 8 diagnostic histological features (evidence nodes) that were each independently linked to the diagnosis (decision node) by a conditional probability matrix. Diagnostic outcomes comprised normal; koilocytosis; and cervical intraepithelial neoplasia (CIN) I, CIN II, and CIN III. For each evidence feature, a set of images was recorded that represented the full spectrum of change for that feature. The system was designed to be interactive in that the histopathologist was prompted to enter evidence into the network via a specifically designed graphical user interface (i-Path Diagnostics, Belfast, Northern Ireland). Membership functions were used to derive the relative likelihoods for the alternative feature outcomes, the likelihood vector was entered into the network, and the updated diagnostic belief was computed for the diagnostic outcomes and displayed. A cumulative probability graph was generated throughout the diagnostic process and presented on screen. The network was tested on 50 cervical colposcopic biopsy specimens, comprising 10 cases each of normal, koilocytosis, CIN I, CIN II, and CIN III. These had been preselected by a consultant gynecological pathologist. Using conventional morphological assessment, the cases were classified on 2 separate occasions by 2 consultant and 2 junior pathologists. The cases were then also classified using the DSS on 2 occasions by the 4 pathologists and by 2 medical students with no experience in cervical histology. Interobserver and intraobserver agreement using morphology and using the DSS was calculated with κ statistics. Intraobserver reproducibility using conventional unaided diagnosis was reasonably good (kappa range, 0.688 to 0.861), but interobserver agreement was poor (kappa range, 0.347 to 0.747). Using the DSS improved overall reproducibility between individuals. It did not, however, enhance the diagnostic performance of junior pathologists when their DSS-based diagnoses were compared against those of an experienced consultant. The generation of a cumulative probability graph nevertheless allowed a comparison of individual performance, of how individual features were assessed in the same case, and of how this contributed to diagnostic disagreement between individuals. Diagnostic features such as nuclear pleomorphism were shown to be particularly problematic and poorly reproducible. DSSs such as this therefore have a role to play not only in enhancing decision making but also in the study of diagnostic protocol, education, self-assessment, and quality control. © 2003 Elsevier Inc. All rights reserved.
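The cumulative updating this abstract describes can be sketched as follows. This is only an illustration under stated assumptions: evidence nodes are treated as conditionally independent given the diagnosis, each likelihood vector is applied as soft evidence, and every number, the feature names, and the function `update_belief` are hypothetical rather than taken from the study or from the i-Path software.

```python
from typing import List

DIAGNOSES = ["normal", "koilocytosis", "CIN I", "CIN II", "CIN III"]


def update_belief(
    belief: List[float],
    cpm: List[List[float]],
    likelihood: List[float],
) -> List[float]:
    """Fold one feature's likelihood vector into the diagnostic belief.

    belief[d]     : current probability of diagnosis d
    cpm[d][k]     : P(feature outcome k | diagnosis d), one row per diagnosis
    likelihood[k] : relative likelihood of outcome k for the observed case
    """
    unnormalised = [
        b * sum(p * lam for p, lam in zip(row, likelihood))
        for b, row in zip(belief, cpm)
    ]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("evidence is impossible under every diagnosis")
    return [u / total for u in unnormalised]


# Hypothetical conditional probability matrices for two of the features,
# each with three outcomes (absent, mild, marked).
cpm_pleomorphism = [
    [0.90, 0.09, 0.01],  # normal
    [0.70, 0.25, 0.05],  # koilocytosis
    [0.40, 0.50, 0.10],  # CIN I
    [0.15, 0.55, 0.30],  # CIN II
    [0.05, 0.30, 0.65],  # CIN III
]
cpm_mitoses = [
    [0.85, 0.14, 0.01],
    [0.75, 0.20, 0.05],
    [0.50, 0.40, 0.10],
    [0.20, 0.50, 0.30],
    [0.05, 0.35, 0.60],
]

# Hypothetical likelihood vectors, e.g. as produced by membership functions.
evidence = [
    (cpm_pleomorphism, [0.1, 0.6, 0.3]),
    (cpm_mitoses, [0.0, 0.3, 0.7]),
]

belief = [1.0 / len(DIAGNOSES)] * len(DIAGNOSES)  # assumed uniform prior
history = [belief]  # the data behind a cumulative probability graph
for cpm, likelihood in evidence:
    belief = update_belief(belief, cpm, likelihood)
    history.append(belief)
```

Each entry of `history` is one point on a cumulative probability graph, so two observers' histories for the same case can be compared feature by feature.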
Abstract:
This is a study of the processes for freeing children for adoption in Northern Ireland. The focus was the time taken from admission to care to adoption order. The findings confirmed that the process is dogged by delay at each stage. In total the average time from the child becoming looked after to the granting of an adoption order was 4.5 years. Most of the time taken was in the stages for which social services had lead responsibility, principally the decision to pursue adoption as the plan for a child. The children were very young when admitted to care - average age 1 year 7 months. Most were admitted to care because they were being neglected. Their parents were well known to social services and had multiple problems. Most parents unsuccessfully contested the social services' application and this contributed much to the delay. Their former foster parents adopted almost half of the children and these children tended to be placed more quickly with their adopters than those placed with adopters who were not their foster parents prior to the adoption process.
Abstract:
The traditional planning process in the UK and elsewhere takes too long, is demanding on scarce resources, and is often unrelated to the needs and demands of society. It segregates plan making from decision making, with the consultants planning, the politicians deciding, and the community receiving without being integrated into the planning and decision-making process. The Scottish planning system is undergoing radical changes, as evidenced by the publication of the Planning Advice Note (PAN) by the Scottish Executive in July 2006 with the aim of enabling community engagement that allows for openness and accountability in the decision-making process. Public engagement is a process driven by research into physical, social and economic systems, aimed at improving the process at the level of the community through problem solving and at the level of the city region through strategic planning. There are several methods available to engage the community in large-scale projects. The two best known are the Enquiry by Design and Charrette approaches, used in the UK and US respectively. This paper is an independent and rigorous analysis of the Charrette process as observed in the proposed Tornagrain settlement in the Highlands area of Scotland. It attempts to gauge and analyse the attitudes and perceptions of the Charrette participants, as well as the mechanics and structure of the Charrette. The study analyses the Charrette approach as a method of future public engagement and its effectiveness within the Scottish planning system in view of PAN 2005. The analysis revealed that the Charrette as a method of engagement could be effective in changing the attitudes of the community to the design process under certain conditions, as discussed in the paper.