43 results for semi-Markov decision process
Abstract:
Markov Decision Processes (MDPs) are extensively used to encode sequences of decisions with probabilistic effects. Markov Decision Processes with Imprecise Probabilities (MDPIPs) encode sequences of decisions whose effects are modeled using sets of probability distributions. In this paper we examine the computation of Γ-maximin policies for MDPIPs using multilinear and integer programming. We discuss the application of our algorithms to “factored” models and to a recent proposal, Markov Decision Processes with Set-valued Transitions (MDPSTs), that unifies the fields of probabilistic and “nondeterministic” planning in artificial intelligence research.
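The Γ-maximin criterion mentioned above picks, in each state, the action whose worst-case expected value over the set of admissible distributions is largest. The sketch below illustrates that criterion with a plain robust value iteration. It is not the multilinear or integer programming formulation the paper studies. It assumes each credal set is given as a finite list of extreme points, so the inner minimum can be taken by enumeration. The function name, the toy model, and the discount factor are all hypothetical.

```python
def gamma_maximin_value_iteration(states, actions, reward, credal,
                                  gamma=0.9, tol=1e-8, max_iter=10000):
    """Robust (max-min) value iteration.

    reward[(s, a)] is the immediate reward.
    credal[(s, a)] is a list of distributions (dicts state -> probability),
    assumed to be the extreme points of the credal set for (s, a).
    """
    values = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_values = {}
        for s in states:
            best = float("-inf")
            for a in actions:
                # Worst-case expected next value over the listed extreme points.
                worst = min(
                    sum(p.get(t, 0.0) * values[t] for t in states)
                    for p in credal[(s, a)]
                )
                best = max(best, reward[(s, a)] + gamma * worst)
            new_values[s] = best
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    return values


# Hypothetical two-state model: state 1 is absorbing with zero reward.
states = [0, 1]
actions = ["x", "y"]
reward = {(0, "x"): 1.0, (0, "y"): 0.5, (1, "x"): 0.0, (1, "y"): 0.0}
credal = {
    # Action "x" may either stay in state 0 or jump to state 1 (imprecise).
    (0, "x"): [{0: 1.0}, {1: 1.0}],
    # Action "y" has a single, precise distribution: stay in state 0.
    (0, "y"): [{0: 1.0}],
    (1, "x"): [{1: 1.0}],
    (1, "y"): [{1: 1.0}],
}
V = gamma_maximin_value_iteration(states, actions, reward, credal)
```

In this toy model the imprecise action "x" earns more per step, but its worst case leads to the zero-value absorbing state, so the max-min criterion prefers the precise action "y".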
Abstract:
In this paper, we investigate the remanufacturing problem of pricing single-class used products (cores) in the face of random price-dependent returns and random demand. Specifically, we propose a dynamic pricing policy for the cores and then model the problem as a continuous-time Markov decision process. Our models are designed to address three objectives: finite horizon total cost minimization, infinite horizon discounted cost, and average cost minimization. Besides proving optimal policy uniqueness and establishing monotonicity results for the infinite horizon problem, we also characterize the structures of the optimal policies, which can greatly simplify the computational procedure. Finally, we use computational examples to assess the impacts of specific parameters on optimal price and reveal the benefits of a dynamic pricing policy. © 2013 Elsevier B.V. All rights reserved.
Abstract:
In remanufacturing, the supply of used products and the demand for remanufactured products are usually mismatched because of the great uncertainties on both sides. In this paper, we propose a dynamic pricing policy to balance this uncertain supply and demand. Specifically, we study a remanufacturer’s problem of pricing a single class of cores with random price-dependent returns and random demand for the remanufactured products with backlogs. We model this pricing task as a continuous-time Markov decision process, which addresses both the finite and infinite horizon problems, and provide managerial insights by analyzing the structural properties of the optimal policy. We then use several computational examples to illustrate the impacts of particular system parameters on pricing policy.
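A common way to compute policies for a continuous-time Markov decision process is uniformization, which converts transition rates into the transition probabilities of an equivalent discrete-time model. The abstract does not say whether the authors use this technique, so the sketch below is only an illustration. The state space (number of cores in stock), the price labels, and every rate are hypothetical.

```python
def uniformize(rates, states, uniform_rate):
    """Convert transition rates into discrete-time transition probabilities.

    rates[(s, a)] maps each successor state t != s to a transition rate.
    uniform_rate must be at least the largest total exit rate.
    """
    probs = {}
    for (s, a), out in rates.items():
        exit_rate = sum(r for t, r in out.items() if t != s)
        if exit_rate > uniform_rate:
            raise ValueError("uniform_rate is smaller than an exit rate")
        row = {t: out.get(t, 0.0) / uniform_rate for t in states if t != s}
        # Leftover probability mass becomes a fictitious self-transition.
        row[s] = 1.0 - exit_rate / uniform_rate
        probs[(s, a)] = row
    return probs


def discrete_discount(alpha, uniform_rate):
    """Discount factor of the uniformized model for continuous discount rate alpha."""
    return uniform_rate / (alpha + uniform_rate)


# Hypothetical example: states count cores in stock; a higher acquisition
# price attracts returns faster, while demand removes cores at a fixed rate.
states = [0, 1, 2]
rates = {
    (1, "high_price"): {2: 3.0, 0: 1.0},
    (1, "low_price"): {2: 1.0, 0: 1.0},
}
P = uniformize(rates, states, uniform_rate=5.0)
beta = discrete_discount(alpha=1.0, uniform_rate=4.0)
```

The resulting discrete-time model can then be handled by standard value or policy iteration. Cost rates would also need rescaling by 1 / (alpha + uniform_rate) in the discounted case.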
Abstract:
This article reports results of an experiment designed to analyze the link between risky decisions made by couples and risky decisions made separately by each spouse. We estimate both the spouses' and the couples' degrees of risk aversion, we assess how the risk preferences of the two spouses aggregate when they make risky decisions, and we shed light on the dynamics of the decision process that takes place when couples make risky decisions. We find that, far from being fixed, the balance of power within the household is malleable. In most couples, men initially have more decision-making power than women, but women who ultimately implement the joint decisions gain more and more power over the course of decision making.
Abstract:
Hypothetical contingent valuation surveys used to elicit values for environmental and other public goods often employ variants of the referendum mechanism due to the cognitive simplicity and familiarity of respondents with this voting format. One variant, the double referendum mechanism, requires respondents to state twice how they would vote for a given policy proposal given their cost of the good. Data from these surveys often exhibit anomalies inconsistent with standard economic models of consumer preferences. There are a number of published explanations for these anomalies, mostly focusing on problems with the second vote. This article investigates which aspects of the hypothetical task affect the degree of non-demand revelation and takes an individual-based approach to identifying people most likely to non-demand reveal. A clear profile emerges from our model of a person who faces a negative surplus (i.e., a net loss) in the second vote and invokes non-self-interested, non-financial motivations during the decision process.
Abstract:
Making a decision is often a matter of listing and comparing positive and negative arguments. In such cases, the evaluation scale for decisions should be considered bipolar, that is, negative and positive values should be explicitly distinguished. That is what is done, for example, in Cumulative Prospect Theory. However, contrary to the latter framework that presupposes genuine numerical assessments, human agents often decide on the basis of an ordinal ranking of the pros and the cons, and by focusing on the most salient arguments. In other terms, the decision process is qualitative as well as bipolar. In this article, based on a bipolar extension of possibility theory, we define and axiomatically characterize several decision rules tailored for the joint handling of positive and negative arguments in an ordinal setting. The simplest rules can be viewed as extensions of the maximin and maximax criteria to the bipolar case, and consequently suffer from poor decisive power. More decisive rules that refine the former are also proposed. These refinements agree both with principles of efficiency and with the spirit of order-of-magnitude reasoning that prevails in qualitative decision theory. The most refined decision rule uses leximin rankings of the pros and the cons, and the ideas of counting arguments of equal strength and cancelling pros by cons. It is shown to come down to a special case of Cumulative Prospect Theory, and to subsume the “Take the Best” heuristic studied by cognitive psychologists.
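The idea of counting arguments of equal strength and cancelling pros by cons can be illustrated with a small lexicographic comparison. This is one plausible reading of the abstract's description, not necessarily the article's exact definition. It assumes each argument carries an integer strength level (higher means more salient), and the function names are hypothetical.

```python
from collections import Counter


def net_counts(pros, cons):
    """Map each strength level to (number of pros - number of cons) at that level."""
    net = Counter(pros)
    net.subtract(Counter(cons))  # a con cancels a pro of the same strength
    return net


def lexi_compare(option_a, option_b):
    """Compare two options given as (pros, cons) lists of strength levels.

    Returns 1 if option_a is preferred, -1 if option_b is preferred, 0 on a tie.
    Levels are scanned from the most to the least salient, and the first level
    at which the net counts differ decides.
    """
    net_a = net_counts(*option_a)
    net_b = net_counts(*option_b)
    for level in sorted(set(net_a) | set(net_b), reverse=True):
        if net_a[level] != net_b[level]:
            return 1 if net_a[level] > net_b[level] else -1
    return 0


# Option A: a strong pro cancelled by a strong con, plus one weak pro.
# Option B: a single medium-strength pro and no cons.
result = lexi_compare(([3, 1], [3]), ([2], []))
```

Here the strongest arguments of option A cancel out, so the medium-strength pro of option B decides the comparison before A's weak pro is ever considered.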
Abstract:
While the repeated nature of Discrete Choice Experiments is advantageous from a sampling efficiency perspective, patterns of choice may differ across the tasks, due, in part, to learning and fatigue. Using probabilistic decision process models, we find in a field study that learning and fatigue behavior may only be exhibited by a small subset of respondents. Most respondents in our sample show preference and variance stability consistent with rational pre-existent and well-formed preferences. Nearly all of the remainder exhibit both learning and fatigue effects. An important aspect of our approach is that it enables learning and fatigue effects to be explored, even though they were not envisaged during survey design or data collection.
Abstract:
The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDPs) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs allow imprecise beliefs to be modeled, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDPs by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.
Abstract:
There are established migrant reasons to explain rural in-migration. These include quality of life, rural idyll and lifestyle motivations. However, such one-dimensional sound bites portray rural in-migration in overly simplistic and stereotypical terms. In contrast, this paper distinguishes the decision to move from the reason for moving and in doing so sheds new light on the interconnections between different domains (family, work, finance, health) of the migrant's life which contribute to migration behaviour. Focussing on early retirees to mid-Wales and adopting a life course perspective, the overall decision to move is disaggregated into a series of decisions. Giving voices to the migrants themselves demonstrates the combination of life events necessary to lead to migration behaviour, the variable factors (and often economic dominance) considered in the choice of destination (including that many are reluctant migrants to Wales), and the perceived 'accidental' choice of location and/or property. It is argued that quality of life, rural idyll and lifestyle sound bites offer an inadequate understanding of rural in-migration and associated decision-making processes. Moreover, they disguise the true nature of migrant decision making.