986 results for Value Functions
Abstract:
This paper contributes a unified formulation that merges previous analyses of the prediction of the performance (value function) of a given sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
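As a point of reference for the setting in this abstract, the following is a minimal sketch of policy evaluation with a linearly approximated value function, using plain TD(0). It is not the adaptive algorithm proposed in the paper. The two-state chain, the feature vectors, the step size, and all names are assumptions chosen for illustration.

```python
def linear_value(weights, feature_vector):
    """Linear value estimate: dot product of weights and state features."""
    return sum(w * f for w, f in zip(weights, feature_vector))


def td0_linear(transitions, features, gamma, step_size, sweeps):
    """Plain TD(0) with linear function approximation under a fixed policy.

    transitions: (state, reward, next_state) tuples, replayed in order.
    features:    dict mapping each state to its feature vector.
    """
    dim = len(next(iter(features.values())))
    weights = [0.0] * dim
    for _ in range(sweeps):
        for state, reward, next_state in transitions:
            # Temporal-difference error for this transition.
            td_error = (
                reward
                + gamma * linear_value(weights, features[next_state])
                - linear_value(weights, features[state])
            )
            # Gradient step along the features of the visited state.
            weights = [
                w + step_size * td_error * f
                for w, f in zip(weights, features[state])
            ]
    return weights


# Hypothetical two-state cycle: 0 -> 1 (reward 1), 1 -> 0 (reward 0).
features = {0: [1.0, 0.0], 1: [1.0, 1.0]}
transitions = [(0, 1.0, 1), (1, 0.0, 0)]
gamma = 0.9

weights = td0_linear(transitions, features, gamma, step_size=0.05, sweeps=20000)
v0 = linear_value(weights, features[0])
v1 = linear_value(weights, features[1])
print(v0, v1)
```

In this toy chain the features can represent the true values exactly, so the estimate can be checked against the Bellman equations V(0) = 1 + 0.9 V(1) and V(1) = 0.9 V(0). With a large state space and fewer features than states, only an approximation is available, which is the regime the abstract addresses.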
Abstract:
We investigate the role of local connectedness in utility theory and prove that any continuous total preorder on a locally connected separable space is continuously representable. This is a new simple criterion for the representability of continuous preferences, and is not a consequence of the standard theorems in utility theory that use conditions such as connectedness and separability, second countability, or path-connectedness. Finally we give applications to problems involving the existence of value functions in population ethics and to the problem of proving the existence of continuous utility functions in general equilibrium models with land as one of the commodities. (C) 2003 Elsevier B.V. All rights reserved.
Abstract:
This work is concerned with the existence of an optimal control strategy for the long-run average continuous control problem of piecewise-deterministic Markov processes (PDMPs). In Costa and Dufour (2008), sufficient conditions were derived to ensure the existence of an optimal control by using the vanishing discount approach. These conditions were mainly expressed in terms of the relative difference of the alpha-discount value functions. The main goal of this paper is to derive tractable conditions directly related to the primitive data of the PDMP to ensure the existence of an optimal control. The present work can be seen as a continuation of the results derived in Costa and Dufour (2008). Our main assumptions are written in terms of some integro-differential inequalities related to the so-called expected growth condition, and geometric convergence of the post-jump location kernel associated with the PDMP. An example based on the capacity expansion problem is presented, illustrating the possible applications of the results developed in the paper.
Abstract:
This paper deals with the expected discounted continuous control of piecewise deterministic Markov processes (PDMPs) using a singular perturbation approach for dealing with rapidly oscillating parameters. The state space of the PDMP is written as the product of a finite set and a subset of the Euclidean space R^n. The discrete part of the state, called the regime, characterizes the mode of operation of the physical system under consideration, and is supposed to have a fast behavior (associated with a small parameter epsilon > 0) and a slow behavior. Using an approach similar to that developed in Yin and Zhang (Continuous-Time Markov Chains and Applications: A Singular Perturbation Approach, Applications of Mathematics, vol. 37, Springer, New York, 1998, Chaps. 1 and 3), the idea in this paper is to reduce the number of regimes by considering an averaged model in which the regimes within the same class are aggregated through the quasi-stationary distribution, so that the different states in this class are replaced by a single one. The main goal is to show that the value function of the control problem for the system driven by the perturbed Markov chain converges to the value function of this limit control problem as epsilon goes to zero. This convergence is obtained by, roughly speaking, showing that the infimum and supremum limits of the value functions satisfy two optimality inequalities as epsilon goes to zero. This enables us to show the result by invoking a uniqueness argument, without needing any kind of Lipschitz continuity condition.
Abstract:
Supplier selection is currently considered strategic for companies operating in increasingly dynamic and demanding environments. This dissertation identifies the criteria and methods most commonly used in the supplier selection problem. To this end, articles in the field and by distinguished authors were analysed to understand which criteria, from the most influential areas, matter when deciding on the best suppliers for a company. Based on this study, a short-answer survey was designed and sent to companies operating in Portugal to obtain the importance they assign to each criterion. From the responses, it is concluded that criteria related to quality and cost are the most relevant. Regarding the methods, AHP and SMART were studied both theoretically and practically: the former because it is the most frequently cited in the articles studied, and the latter because it is the simplest to implement and use. For SMART, value functions were created to govern the operation of the method. These functions were developed from scratch, based on a prior literature review for each subcriterion, in order to understand which type of function best describes mathematically the behaviour of each one. Decision-making is very important in organisations, since it can lead to success or failure. Accordingly, the dissertation explains the context of decision-making, the supplier selection problem, how the selection process unfolds, and which methods exist to support the choice of suppliers. Finally, the proposed model, based on the survey results, is presented, together with the application of the two methods (AHP and SMART) for a better understanding of both.
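To make the role of value functions in SMART concrete, here is a minimal sketch of an additive SMART score. It assumes linear value functions, whereas the dissertation selects a function type per subcriterion that is not given in the abstract. The criteria ranges, weights, and supplier data are hypothetical.

```python
def linear_value(x, worst, best):
    """Map a raw score to [0, 1]: worst -> 0, best -> 1.

    Also handles criteria where lower is better (best < worst).
    """
    value = (x - worst) / (best - worst)
    return min(1.0, max(0.0, value))


def smart_score(raw_scores, criteria):
    """Weighted sum of per-criterion values, with weights normalised to 1."""
    total_weight = sum(c["weight"] for c in criteria.values())
    return sum(
        (c["weight"] / total_weight)
        * linear_value(raw_scores[name], c["worst"], c["best"])
        for name, c in criteria.items()
    )


# Hypothetical criteria: quality on a 0-10 scale, cost in currency units.
criteria = {
    "quality": {"weight": 60, "worst": 0.0, "best": 10.0},  # higher is better
    "cost": {"weight": 40, "worst": 100.0, "best": 50.0},   # lower is better
}
suppliers = {
    "Supplier A": {"quality": 8.0, "cost": 80.0},
    "Supplier B": {"quality": 6.0, "cost": 60.0},
}

scores = {name: smart_score(raw, criteria) for name, raw in suppliers.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Swapping `linear_value` for a concave or S-shaped function changes how quickly additional quality or cost savings are rewarded, which is the kind of modelling choice the dissertation makes for each subcriterion.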
Abstract:
This paper presents an improved version of an application whose goal is to provide a simple and intuitive way to use multicriteria decision methods in day-to-day decision problems. The application allows comparisons between several alternatives with several criteria, always keeping a permanent backup of both model and results, and provides a framework to incorporate new methods in the future. Developed in C#, the application implements the AHP, SMART and Value Functions methods.
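For readers unfamiliar with AHP, the sketch below shows one common way to turn a pairwise comparison matrix into criterion weights. It is written in Python for brevity and is not taken from the C# application. It uses the row geometric mean method, which approximates the principal-eigenvector weights and coincides with them when the matrix is perfectly consistent. The matrix values are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix.

    pairwise[i][j] states how many times more important criterion i is
    than criterion j. Uses row geometric means, normalised to sum to 1.
    """
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent judgements for three criteria.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]

weights = ahp_weights(pairwise)
print(weights)
```

Real judgement matrices are rarely consistent, so a full implementation would normally also compute a consistency ratio before trusting the weights.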
Abstract:
We present an envelope theorem for establishing first-order conditions in decision problems involving continuous and discrete choices. Our theorem accommodates general dynamic programming problems, even with unbounded marginal utilities. And, unlike classical envelope theorems that focus only on differentiating value functions, we accommodate other endogenous functions such as default probabilities and interest rates. Our main technical ingredient is how we establish the differentiability of a function at a point: we sandwich the function between two differentiable functions from above and below. Our theory is widely applicable. In unsecured credit models, neither interest rates nor continuation values are globally differentiable. Nevertheless, we establish an Euler equation involving marginal prices and values. In adjustment cost models, we show that first-order conditions apply universally, even if optimal policies are not (S,s). Finally, we incorporate indivisible choices into a classic dynamic insurance analysis.
Abstract:
This paper studies the average control problem of discrete-time Markov Decision Processes (MDPs for short) with general state space, Feller transition probabilities, and possibly non-compact control constraint sets A(x). Two hypotheses are considered: either the cost function c is strictly unbounded or the multifunctions A_r(x) = {a ∈ A(x) : c(x, a) ≤ r} are upper-semicontinuous and compact-valued for each real r. For these two cases we provide new results for the existence of a solution to the average-cost optimality equality and inequality using the vanishing discount approach. We also study the convergence of the policy iteration approach under these conditions. It should be pointed out that we do not make any assumptions regarding the convergence and the continuity of the limit function generated by the sequence of relative differences of the alpha-discounted value functions and the Poisson equations, as often encountered in the literature. (C) 2012 Elsevier Inc. All rights reserved.
Abstract:
Economic theory distinguishes two concepts of utility: decision utility, objectively quantifiable by choices, and experienced utility, referring to the satisfaction by an obtainment. To date, experienced utility is typically measured with subjective ratings. This study intended to quantify experienced utility by global levels of neuronal activity. Neuronal activity was measured by means of electroencephalographic (EEG) responses to gain and omission of graded monetary rewards at the level of the EEG topography in human subjects. A novel analysis approach allowed approximating psychophysiological value functions for the experienced utility of monetary rewards. In addition, we identified the time windows of the event-related potentials (ERP) and the respective intracortical sources, in which variations in neuronal activity were significantly related to the value or valence of outcomes. Results indicate that value functions of experienced utility and regret disproportionally increase with monetary value, and thus contradict the compressing value functions of decision utility. The temporal pattern of outcome evaluation suggests an initial (∼250 ms) coarse evaluation regarding the valence, concurrent with a finer-grained evaluation of the value of gained rewards, whereas the evaluation of the value of omitted rewards emerges later. We hypothesize that this temporal double dissociation is explained by reward prediction errors. Finally, a late, yet unreported, reward-sensitive ERP topography (∼500 ms) was identified. The sources of these topographical covariations are estimated in the ventromedial prefrontal cortex, the medial frontal gyrus, the anterior and posterior cingulate cortex and the hippocampus/amygdala. The results provide important new evidence regarding “how,” “when,” and “where” the brain evaluates outcomes with different hedonic impact.
Abstract:
We propose a new kernel estimator of the cumulative distribution function based on transformation and bias-reducing techniques. We derive the optimal bandwidth that minimises the asymptotic integrated mean squared error. The simulation results show that our proposed kernel estimator improves on alternative approaches when the variable has an extreme value distribution with a heavy tail and the sample size is small.
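As background, the classical (untransformed) kernel estimator of a cumulative distribution function averages a smooth kernel CDF centred at each observation. The sketch below implements that baseline with a Gaussian kernel. It is not the transformed, bias-reduced estimator proposed in the paper, and the sample values and the bandwidth are hypothetical rather than the optimal bandwidth derived there.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, used here as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(x, sample, bandwidth):
    """Classical kernel CDF estimate: mean of normal_cdf((x - X_i) / h)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return sum(normal_cdf((x - xi) / bandwidth) for xi in sample) / len(sample)


# Hypothetical small sample with one large observation in the right tail.
sample = [0.2, 0.5, 0.9, 1.4, 3.1]
bandwidth = 0.5  # illustrative value, not an optimised bandwidth
grid = [0.0, 1.0, 2.0, 4.0]
estimates = [kernel_cdf(x, sample, bandwidth) for x in grid]
print(estimates)
```

With heavy-tailed data and few observations, a single global bandwidth tends to smooth the body and the tail poorly at the same time, which is the situation the abstract's transformation approach targets.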
Abstract:
In this work, the energy response functions of a CdTe detector were obtained by Monte Carlo (MC) simulation in the energy range from 5 to 160 keV, using the PENELOPE code. In the response calculations the carrier transport features and the detector resolution were included. The computed energy response function was validated through comparison with experimental results obtained with Am-241 and Eu-152 sources. In order to investigate the influence of the correction by the detector response in the diagnostic energy range, x-ray spectra were measured using a CdTe detector (model XR-100T, Amptek), and then corrected by the energy response of the detector using the stripping procedure. Results showed that the CdTe detector exhibits good energy response at low energies (below 40 keV), showing only small distortions in the measured spectra. For energies below about 80 keV, the escape of Cd and Te K x-rays produces significant distortions in the measured x-ray spectra. For higher energies, the most important corrections are the detector efficiency and the carrier trapping effects. The results showed that, after correction by the energy response, the measured spectra are in good agreement with those provided by a theoretical model from the literature. Finally, our results showed that detailed knowledge of the response function and a proper correction procedure are fundamental for achieving more accurate spectra from which quality parameters (i.e., half-value layer and homogeneity coefficient) can be determined.
Abstract:
The integral of the Wigner function of a quantum-mechanical system over a region or its boundary in the classical phase plane is called a quasiprobability integral. Unlike a true probability integral, its value may lie outside the interval [0, 1]. It is characterized by a corresponding selfadjoint operator, to be called a region or contour operator as appropriate, which is determined by the characteristic function of that region or contour. The spectral problem is studied for commuting families of region and contour operators associated with concentric discs and circles of given radius a. Their respective eigenvalues are determined as functions of a, in terms of the Gauss-Laguerre polynomials. These polynomials provide a basis of vectors in a Hilbert space carrying the positive discrete series representation of the algebra su(1, 1) ≅ so(2, 1). The explicit relation between the spectra of operators associated with discs and circles with proportional radii is given in terms of the discrete variable Meixner polynomials.
Abstract:
The financial and economic analysis of investment projects is typically carried out using the technique of discounted cash flow (DCF) analysis. This module introduces concepts of discounting and DCF analysis for the derivation of project performance criteria such as net present value (NPV), internal rate of return (IRR) and benefit-to-cost (B/C) ratios. These concepts and criteria are introduced with respect to a simple example, for which calculations using Microsoft Excel are demonstrated.
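The three criteria named in this abstract can be sketched in a few lines. The example below is written in Python rather than Excel, and the cash flows and the 10% discount rate are hypothetical; they are not the module's own example. The IRR is found by bisection, which assumes the NPV changes sign exactly once between the chosen bounds.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] occurs at the end of year t (t = 0 is today)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=1.0, tolerance=1e-9):
    """Internal rate of return by bisection on the sign of the NPV."""
    npv_low = npv(low, cash_flows)
    if npv_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign between the bounds")
    while high - low > tolerance:
        mid = (low + high) / 2.0
        npv_mid = npv(mid, cash_flows)
        if npv_low * npv_mid <= 0:
            high = mid  # root lies in the lower half
        else:
            low, npv_low = mid, npv_mid  # root lies in the upper half
    return (low + high) / 2.0


def benefit_cost_ratio(rate, benefits, costs):
    """Present value of benefits divided by present value of costs."""
    return npv(rate, benefits) / npv(rate, costs)


# Hypothetical project: invest 1000 now, receive 500 per year for three years.
benefits = [0.0, 500.0, 500.0, 500.0]
costs = [1000.0, 0.0, 0.0, 0.0]
net_flows = [b - c for b, c in zip(benefits, costs)]

project_npv = npv(0.10, net_flows)
project_irr = irr(net_flows)
project_bcr = benefit_cost_ratio(0.10, benefits, costs)
print(project_npv, project_irr, project_bcr)
```

A project is usually judged acceptable when the NPV is positive, the IRR exceeds the discount rate, and the B/C ratio is above one. For a single initial outlay followed by inflows, as here, the three criteria agree.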
Abstract:
Conservation of biodiversity can generate considerable indirect economic value, and this is being increasingly recognized in China. For a nature reserve of the forest ecosystem type, the most important of its values are its ecological functions, which provide human beings and other living things with beneficial environmental services. These services include water conservancy, soil protection, CO2 fixation and O2 release, nutrient cycling, pollutant decomposition, and disease and pest control. Based on a case study in Changbaishan Mountain Biosphere Reserve in Northeast China, this paper provides a monetary valuation of these services by using opportunity cost and alternative cost methods. Using such an approach, this reserve is valued at 510.11 million yuan (USD 61.68 million) per year, 10 times higher than the opportunity cost (51.78 mill. yuan/ha.a) for regular timber production. While China has heeded the United Nations Environment Programme (UNEP)'s call for economic evaluation of ecological functions, the assessment techniques used need to be improved in China and in the West for reasons mentioned.