942 resultados para Markov decision processes


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper details the development and evaluation of AstonTAC, an energy broker that successfully participated in the 2012 Power Trading Agent Competition (Power TAC). AstonTAC buys electrical energy from the wholesale market and sells it in the retail market. The main focus of the paper is on the broker’s bidding strategy in the wholesale market. In particular, it employs Markov Decision Processes (MDP) to purchase energy at low prices in a day-ahead power wholesale market, and keeps energy supply and demand balanced. Moreover, we explain how the agent uses Non-Homogeneous Hidden Markov Model (NHHMM) to forecast energy demand and price. An evaluation and analysis of the 2012 Power TAC finals show that AstonTAC is the only agent that can buy energy at low price in the wholesale market and keep energy imbalance low.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud computing is a new technological paradigm offering computing infrastructure, software and platforms as a pay-as-you-go, subscription-based service. Many potential customers of cloud services require essential cost assessments to be undertaken before transitioning to the cloud. Current assessment techniques are imprecise as they rely on simplified specifications of resource requirements that fail to account for probabilistic variations in usage. In this paper, we address these problems and propose a new probabilistic pattern modelling (PPM) approach to cloud costing and resource usage verification. Our approach is based on a concise expression of probabilistic resource usage patterns translated to Markov decision processes (MDPs). Key costing and usage queries are identified and expressed in a probabilistic variant of temporal logic and calculated to a high degree of precision using quantitative verification techniques. The PPM cost assessment approach has been implemented as a Java library and validated with a case study and scalability experiments. © 2012 Springer-Verlag Berlin Heidelberg.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The challenge of detecting a change in the distribution of data is a sequential decision problem that is relevant to many engineering solutions, including quality control and machine and process monitoring. This dissertation develops techniques for exact solution of change-detection problems with discrete time and discrete observations. Change-detection problems are classified as Bayes or minimax based on the availability of information on the change-time distribution. A Bayes optimal solution uses prior information about the distribution of the change time to minimize the expected cost, whereas a minimax optimal solution minimizes the cost under the worst-case change-time distribution. Both types of problems are addressed. The most important result of the dissertation is the development of a polynomial-time algorithm for the solution of important classes of Markov Bayes change-detection problems. Existing techniques for epsilon-exact solution of partially observable Markov decision processes have complexity exponential in the number of observation symbols. A new algorithm, called constellation induction, exploits the concavity and Lipschitz continuity of the value function, and has complexity polynomial in the number of observation symbols. It is shown that change-detection problems with a geometric change-time distribution and identically- and independently-distributed observations before and after the change are solvable in polynomial time. Also, change-detection problems on hidden Markov models with a fixed number of recurrent states are solvable in polynomial time. A detailed implementation and analysis of the constellation-induction algorithm are provided. Exact solution methods are also established for several types of minimax change-detection problems. Finite-horizon problems with arbitrary observation distributions are modeled as extensive-form games and solved using linear programs. Infinite-horizon problems with linear penalty for detection delay and identically- and independently-distributed observations can be solved in polynomial time via epsilon-optimal parameterization of a cumulative-sum procedure. Finally, the properties of policies for change-detection problems are described and analyzed. Simple classes of formal languages are shown to be sufficient for epsilon-exact solution of change-detection problems, and methods for finding minimally sized policy representations are described.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper deals with the long run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with compact action space depending on the state variable. The control variable acts on the jump rate and transition measure of the PDMP, and the running and boundary costs are assumed to be positive but not necessarily bounded. Our first main result is to obtain an optimality equation for the long run average cost in terms of a discrete-time optimality equation related to the embedded Markov chain given by the postjump location of the PDMP. Our second main result guarantees the existence of a feedback measurable selector for the discrete-time optimality equation by establishing a connection between this equation and an integro-differential equation. Our final main result is to obtain some sufficient conditions for the existence of a solution for a discrete-time optimality inequality and an ordinary optimal feedback control for the long run average cost using the so-called vanishing discount approach. Two examples are presented illustrating the possible applications of the results developed in the paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) for the long run average continuous control problem of piecewise deterministic Markov processes (PDMP`s) taking values in a general Borel space and with compact action space depending on the state variable. In order to do that we first derive some important properties for a pseudo-Poisson equation associated to the problem. In the sequence it is shown that the convergence of the PIA to a solution satisfying the optimality equation holds under some classical hypotheses and that this optimal solution yields to an optimal control strategy for the average control problem for the continuous-time PDMP in a feedback form.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work is concerned with the existence of an optimal control strategy for the long-run average continuous control problem of piecewise-deterministic Markov processes (PDMPs). In Costa and Dufour (2008), sufficient conditions were derived to ensure the existence of an optimal control by using the vanishing discount approach. These conditions were mainly expressed in terms of the relative difference of the alpha-discount value functions. The main goal of this paper is to derive tractable conditions directly related to the primitive data of the PDMP to ensure the existence of an optimal control. The present work can be seen as a continuation of the results derived in Costa and Dufour (2008). Our main assumptions are written in terms of some integro-differential inequalities related to the so-called expected growth condition, and geometric convergence of the post-jump location kernel associated to the PDMP. An example based on the capacity expansion problem is presented, illustrating the possible applications of the results developed in the paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present a stochastic model for disability insurance contracts. The model is based on a discrete time non-homogeneous semi-Markov process (DTNHSMP) to which the backward recurrence time process is introduced. This permits a more exhaustive study of disability evolution and a more efficient approach to the duration problem. The use of semi-Markov reward processes facilitates the possibility of deriving equations of the prospective and retrospective mathematical reserves. The model is applied to a sample of contracts drawn at random from a mutual insurance company.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the aftermath of a large-scale disaster, agents' decisions derive from self-interested (e.g. survival), common-good (e.g. victims' rescue) and teamwork (e.g. fire extinction) motivations. However, current decision-theoretic models are either purely individual or purely collective and find it difficult to deal with motivational attitudes; on the other hand, mental-state based models find it difficult to deal with uncertainty. We propose a hybrid, CvI-JI, approach that combines: i) collective 'versus' individual (CvI) decisions, founded on the Markov decision process (MDP) quantitative evaluation of joint-actions, and ii)joint-intentions (JI) formulation of teamwork, founded on the belief-desire-intention (BDI) architecture of general mental-state based reasoning. The CvI-JI evaluation explores the performance's improvement

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Tiwi people of northern Australia have managed natural resources continuously for 6000-8000 years. Tiwi management objectives and outcomes may reflect how they gather information about the environment. We qualitatively analyzed Tiwi documents and management techniques to examine the relation between the social and physical environment of decision makers and their decision-making strategies. We hypothesized that principles of bounded rationality, namely, the use of efficient rules to navigate complex decision problems, explain how Tiwi managers use simple decision strategies (i.e., heuristics) to make robust decisions. Tiwi natural resource managers reduced complexity in decision making through a process that gathers incomplete and uncertain information to quickly guide decisions toward effective outcomes. They used management feedback to validate decisions through an information loop that resulted in long-term sustainability of environmental use. We examined the Tiwi decision-making processes relative to management of barramundi (Lates calcarifer) fisheries and contrasted their management with the state government's management of barramundi. Decisions that enhanced the status of individual people and their attainment of aspiration levels resulted in reliable resource availability for Tiwi consumers. Different decision processes adopted by the state for management of barramundi may not secure similarly sustainable outcomes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

It has been repeatedly debated which strategies people rely on in inference. These debates have been difficult to resolve, partially because hypotheses about the decision processes assumed by these strategies have typically been formulated qualitatively, making it hard to test precise quantitative predictions about response times and other behavioral data. One way to increase the precision of strategies is to implement them in cognitive architectures such as ACT-R. Often, however, a given strategy can be implemented in several ways, with each implementation yielding different behavioral predictions. We present and report a study with an experimental paradigm that can help to identify the correct implementations of classic compensatory and non-compensatory strategies such as the take-the-best and tallying heuristics, and the weighted-linear model.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Scientific studies regarding specifically references do not seem to exist. However, the utilization of references is an important practice for many companies involved in industrial marketing. The purpose of the study is to increase the understanding about the utilization of references in international industrial marketing in order to contribute to the development of a theory of reference behavior. Specifically, the modes of reference usage in industry, the factors affecting a supplier's reference behavior, and the question how references are actually utilized, are explored in the study. Due to the explorative nature of the study, a research design was followed where theory and empirical studies alternated. An Exploratory Framework was developed to guide a pilot case study that resulted in Framework 1. Results of the pilot study guided an expanded literature review that was used to develop first a Structural Framework and a Process Framework which were combined in Framework 2. Then, the second empirical phase of the case study was conducted in the same (pilot) case company. In this phase, Decision Systems Analysis (DSA) was used as the analysis method. The DSA procedure consists of three interviewing waves: initial interviews, reinterviews, and validating interviews. Four reference decision processes were identified, described and analyzed in the form of flowchart descriptions. The flowchart descriptions were used to explore new constructs and to develop new propositions to develop Framework 2 further. The quality of the study was ascertained by many actions in both empirical parts of the study. The construct validity of the study was ascertained by using multiple sources of evidence and by asking the key informant to review the pilot case report. The DSA method itself includes procedures assuring validity. Because of the choice to conduct a single case study, external validity was not even pursued. High reliability was pursued through detailed documentation and thorough reporting of evidence. It was concluded that the core of the concept of reference is a customer relationship regardless of the concrete forms a reference might take in its utilization. Depending on various contingencies, references might have various tasks inside the four roles of increasing 1) efficiency of sales and sales management, 2) efficiency of the business, 3) effectiveness of marketing activities, and 4) effectiveness in establishing, maintaining and enhancing customer relationships. Thus, references have not only external but internal tasks as well. A supplier's reference behavior might be affected by many hierarchical conditions. Additionally, the empirical study showed that the supplier can utilize its references as a continuous, all pervasive decision making process through various practices. The process includes both individual and unstructured decision making subprocesses. The proposed concept of reference can be used to guide a reference policy recommendable for companies for which the utilization of references is important. The significance of the study is threefold: proposing the concept of reference, developing a framework of a supplier's reference behavior and its short term process of utilizing references, and conceptual structuring of an unstructured and in industrial marketing important phenomenon to four roles.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Previous research has shown that often there is clear inertia in individual decision making---that is, a tendency for decision makers to choose a status quo option. I conduct a laboratory experiment to investigate two potential determinants of inertia in uncertain environments: (i) regret aversion and (ii) ambiguity-driven indecisiveness. I use a between-subjects design with varying conditions to identify the effects of these two mechanisms on choice behavior. In each condition, participants choose between two simple real gambles, one of which is the status quo option. I find that inertia is quite large and that both mechanisms are equally important.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

ABSTRACT: The femtocell concept aims to combine fixed-line broadband access with mobile telephony using the deployment of low-cost, low-power third and fourth generation base stations in the subscribers' homes. While the self-configuration of femtocells is a plus, it can limit the quality of service (QoS) for the users and reduce the efficiency of the network, based on outdated allocation parameters such as signal power level. To this end, this paper presents a proposal for optimized allocation of users on a co-channel macro-femto network, that enable self-configuration and public access, aiming to maximize the quality of service of applications and using more efficiently the available energy, seeking the concept of Green networking. Thus, when the user needs to connect to make a voice or a data call, the mobile phone has to decide which network to connect, using the information of number of connections, the QoS parameters (packet loss and throughput) and the signal power level of each network. For this purpose, the system is modeled as a Markov Decision Process, which is formulated to obtain an optimal policy that can be applied on the mobile phone. The policy created is flexible, allowing different analyzes, and adaptive to the specific characteristics defined by the telephone company. The results show that compared to traditional QoS approaches, the policy proposed here can improve energy efficiency by up to 10%.