976 results for Value function


Relevance:

60.00%

Abstract:

This paper uses dynamic programming to study the time consistency of optimal macroeconomic policy in economies with recurring public deficits. To this end, a general equilibrium recursive model introduced in Chang (1998) is extended to include government bonds and production. The original model presents a Sidrauski economy with money and transfers only, implying that the need for government financing through the inflation tax is minimal. The extended model introduces government expenditures and a deficit-financing scheme, analyzing the Sargent-Wallace (1981) problem: recurring deficits may lead the government to default on part of its public debt through inflation. The methodology allows for the computation of the set of all sustainable stabilization plans even when the government cannot pre-commit to an optimal inflation path. This is done through value function iterations, which can be carried out on a computer. The parameters of the extended model are calibrated with Brazilian data, using as case studies three Brazilian stabilization attempts: the Cruzado (1986), Collor (1990) and Real (1994) plans. The calibration of the parameters of the extended model is straightforward, but its numerical solution proves unfeasible due to a dimensionality problem in the algorithm arising from limitations of available computer technology. However, a numerical solution using the original algorithm and some calibrated parameters is obtained. Results indicate that in the absence of government bonds or production only the Real Plan is sustainable in the long run. The numerical solution of the extended algorithm is left for future research.
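As a rough illustration of what "value function iteration" means computationally, the sketch below applies the Bellman operator repeatedly to a small finite Markov decision problem until the value function stops changing. It is not the set-valued algorithm of Chang (1998) used in the paper; the two-state example, the function name `value_iteration`, and the discount factor are assumptions made purely for illustration.

```python
import numpy as np

def value_iteration(P, R, beta=0.95, tol=1e-8, max_iter=10_000):
    """Generic tabular value function iteration (illustrative only).

    P: array (A, S, S), P[a, s, t] = probability of moving s -> t under action a
    R: array (S, A), one-period reward for taking action a in state s
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = R[s, a] + beta * sum_t P[a, s, t] * V[t]
        Q = R + beta * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)

# Hypothetical toy problem: action 0 stays put, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],    # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])   # action 1: switch
R = np.array([[0.0, 0.0],                  # state 0 pays nothing
              [1.0, 0.0]])                 # staying in state 1 pays 1
V, policy = value_iteration(P, R, beta=0.9)
```

In this toy case the fixed point can be checked by hand: staying in state 1 forever is worth 1/(1 - 0.9) = 10, and state 0 is worth 0.9 times that. The dimensionality problem mentioned in the abstract arises because the size of the state grid, and hence of `V`, grows quickly with the number of state variables.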

Relevance:

60.00%

Abstract:

Transaction costs have a random component in the bid-ask spread. Facing a high bid-ask spread, the consumer has the option to wait for better terms of trade, but only by carrying an undesirable portfolio balance. We present the best policy in this case. We pose the control problem and show that the value function is the unique viscosity solution of the relevant variational inequality. Next, a numerical procedure for the problem is presented.

Relevance:

60.00%

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

60.00%

Abstract:

This paper studies the asymptotic optimality of discrete-time Markov decision processes (MDPs) with general state space and action space and having weak and strong interactions. By using a similar approach as developed by Liu, Zhang, and Yin [Appl. Math. Optim., 44 (2001), pp. 105-129], the idea in this paper is to consider an MDP with general state and action spaces and to reduce the dimension of the state space by considering an averaged model. This formulation is often described by introducing a small parameter epsilon > 0 in the definition of the transition kernel, leading to a singularly perturbed Markov model with two time scales. Our objective is twofold. First it is shown that the value function of the control problem for the perturbed system converges to the value function of a limit averaged control problem as epsilon goes to zero. In the second part of the paper, it is proved that a feedback control policy for the original control problem defined by using an optimal feedback policy for the limit problem is asymptotically optimal. Our work extends existing results of the literature in the following two directions: the underlying MDP is defined on general state and action spaces and we do not impose strong conditions on the recurrence structure of the MDP such as Doeblin's condition.

Relevance:

60.00%

Abstract:

This thesis deals with the development of a function approximator and its use in methods for learning discrete and continuous actions: 1. A general function approximator, Locally Weighted Interpolating Growing Neural Gas (LWIGNG), is developed on the basis of a Growing Neural Gas (GNG). The topological neighborhood in the neuron structure is used to interpolate between neighboring neurons and to compute the approximation by local weighting. The performance of the approach, in particular with respect to changing target functions and changing input distributions, is demonstrated in various experiments. 2. For learning discrete actions, the LWIGNG method is combined with Q-learning to form the Q-LWIGNG method. To this end, the underlying GNG algorithm has to be modified, because the input data in action learning arrive in a particular order. Q-LWIGNG achieves very good results on the pole-balancing and mountain-car problems and good results on the acrobot problem. 3. For learning continuous actions, a REINFORCE algorithm is combined with LWIGNG to form the ReinforceGNG method. An actor-critic architecture is used to learn from delayed rewards. LWIGNG approximates both the state-value function and the policy, which is represented in the form of situation-dependent parameters of a normal distribution. ReinforceGNG is successfully applied to learning movements for a simulated two-wheeled robot that has to intercept a rolling ball under certain conditions.
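For readers unfamiliar with the Q-learning update that item 2 builds on, here is a minimal sketch with a plain lookup table in place of the LWIGNG approximator. The five-state chain environment, the constants, and all names are hypothetical and chosen only so that the example is self-contained; the thesis itself uses pole balancing, mountain car and acrobot.

```python
import numpy as np

N_STATES, GOAL = 5, 4      # hypothetical chain: states 0..4, state 4 is terminal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Deterministic toy dynamics: move left or right, reward 1 on reaching GOAL."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, 2))            # table standing in for a function approximator
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = int(rng.integers(2))       # uniformly random behaviour policy (off-policy)
            s2, r, done = step(s, a)
            # Q-learning target: bootstrap from the best action in the next state
            target = r if done else r + gamma * Q[s2].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q

Q = q_learning()
greedy = Q.argmax(axis=1)[:GOAL]           # greedy action in each non-terminal state
```

Because Q-learning is off-policy, the greedy policy read from `Q` can be good even though the data were collected by random actions. Replacing the table with an approximator such as LWIGNG changes how `Q[s, a]` is stored and updated, not the form of the target.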

Relevance:

60.00%

Abstract:

The study investigates the role of credit risk in a continuous time stochastic asset allocation model, since the traditional dynamic framework does not provide credit risk flexibility. The general model of the study extends the traditional dynamic efficiency framework by explicitly deriving the optimal value function for the infinite horizon stochastic control problem via a weighted volatility measure of market and credit risk. The model's optimal strategy was then compared to that obtained from a benchmark Markowitz-type dynamic optimization framework to determine which specification adequately reflects the optimal terminal investment returns and strategy under credit and market risks. The paper shows that an investor's optimal terminal return is lower than typically indicated under the traditional mean-variance framework during periods of elevated credit risk. Hence I conclude that, while the traditional dynamic mean-variance approach may indicate the ideal, in the presence of credit-risk it does not accurately reflect the observed optimal returns, terminal wealth and portfolio selection strategies.

Relevance:

60.00%

Abstract:

We analyze the effect of environmental uncertainties on optimal fishery management in a bio-economic fishery model. Unlike most of the literature on resource economics, but in line with ecological models, we allow the different biological processes of survival and recruitment to be affected differently by environmental uncertainties. We show that the overall effect of uncertainty on the optimal size of a fish stock is ambiguous, depending on the prudence of the value function. For the case of a risk-neutral fishery manager, the overall effect depends on the relative magnitude of two opposing effects, the 'convex-cost effect' and the 'gambling effect'. We apply the analysis to the Baltic cod and the North Sea herring fisheries, concluding that for risk neutral agents the net effect of environmental uncertainties on the optimal size of these fish stocks is negative, albeit small in absolute value. Under risk aversion, the effect on optimal stock size is positive for sufficiently high coefficients of constant relative risk aversion.

Relevance:

60.00%

Abstract:

Resource scarcity, climate change, poverty and underdevelopment, and natural disasters are only some of the great challenges facing humanity, to which the green economy and sustainable development must respond. The concept of sustainability arises from the need to achieve, in all human activities, a new balance with the environment, society and the economy, that is, a more sustainable development. Construction is a basic sector within this new concept, with major impacts on resources, waste, emissions, biodiversity, landscape, social needs, integration, the economic development of the surrounding area, etc. For this reason, sustainable construction is of essential importance, as shown by its wide theoretical and practical application in urban planning and building projects. In civil engineering these approaches are still minimal, although certain sustainability criteria are already being considered in construction projects. Construction consumes many natural and economic resources and has a large social impact. At present it consumes 30% of the resources extracted from the earth and of the energy, and consequently generates 30% of the world's greenhouse gases and solid waste (EEA, 2014). This impact should entail a great responsibility for the professionals and governments who make design and investment decisions in construction every day, and maximum efficiency should be prominent among their objectives. This doctoral thesis proposes a new model for evaluating the sustainability of projects by means of a system of indicators, based on the study areas of existing sustainability certifications and on a multi-criteria analysis of each of the axioms of sustainability.
The main challenge is to propose a methodology that makes it possible to identify, prioritize and select the most important indicators and variables of what is considered sustainable construction in the case of railway infrastructure, more specifically railway bridges, and that also serves to prioritize new projects that adapt to the new objectives of sustainable development: respect for the environment, social integration and economic integration. The aim is to apply these indicators from the earliest stages of the project: planning, design of alternatives and selection of alternatives. To this end, an in-depth analysis of the various sustainability certification organizations worldwide was first carried out and a comparison between them was developed, detailing how the most widespread ones work (BREEAM, LEED, VERDE, DGNB). After this, the mathematical multi-criteria analysis tool MIVES was analyzed for its application, in the thesis, to railway infrastructure. In the second part, a new model of indicators is developed for railway structures: a multi-criteria decision support system based on the three axioms of sustainability (society, environment and economy), organized in a requirements tree inspired by the MIVES method, which proposes a methodology for the case of railway infrastructure. The MIVES methodology structures the decision process along three axes: requirements, components and life cycle. These axes define the boundaries of the system. The requirements axis, or requirements tree, is structured in three levels: specific requirements, criteria and indicators. In addition, it is necessary to define the value function for each indicator, define the importance weight of each element of the tree and, finally, compute the value of each alternative in order to select the best one.
The generation of this requirements tree for railway structures and the measurement of the parameters are original for this type of structure. Finally, after developing the methodology, the proposal was applied in practice, using the proposed method on two existing railway bridges. The results showed that the tool is able to establish a ranking of the interventions that is coherent and discriminating enough for the decision maker to have no doubts when making the decision. This phase is one of the major contributions of the thesis, since it makes it possible to differentiate the weights obtained in each of the study areas, where the decision may vary depending on the needs of the decision maker, the location of the bridge under study, etc.
ABSTRACT Scarce resources, climate change, poverty and underdevelopment, and natural disasters are just some of the great challenges facing humanity, to which the green economy will have to respond. The concept of sustainability arises from the need to bring all human activities into a new equilibrium with the environment, society and the economy, which is known as sustainable development. The construction industry is part of this concept because of its major impacts on resources, waste, emissions, biodiversity, landscape, social needs, integration, economic development, the environment, etc. Sustainable construction is therefore critically important, as already demonstrated by its wide theoretical and practical application in urban planning and building projects. In civil engineering these approaches are still minimal, although some sustainability criteria are already taken into account in infrastructure projects. The construction industry requires a lot of natural resources, has real economic relevance and has a huge social impact.
Currently, it consumes 40% of the power produced as well as of the natural resources extracted from the earth, and thus accounts for 40% of greenhouse gas emissions and solid waste (EEA 2014). These repercussions should be of great concern to governments and to the professionals of this industry in their investment and design decisions. They must be uncompromising in ensuring that maximum efficiency is the main concern. Major events like the COP21 held in Paris in December 2015 are a concrete signal of worldwide awareness of the huge impact of each industry on the climate. This doctoral thesis proposes a new model for evaluating the sustainability of projects by means of a system of indicators, based on the study areas of existing sustainability certifications and on a multi-criteria analysis of each of the axioms of sustainability. The primary aim of this thesis is to study how sustainability can be applied in projects through a system of indicators. The main challenge is to create a methodology suitable for identifying, prioritizing and selecting the most important indicators that define whether a construction is sustainable in the specific case of railway infrastructure. The methodology will help to adapt future projects to the new goals of sustainable development, which are respect for nature, social integration and economic relevance. A crucial point is the consideration of these indicators from the very first stages of a project: planning, design and consideration of alternatives. First, a complete inventory of the world's sustainability certification organizations was compiled in order to compare the most representative ones with regard to how they work (BREEAM, LEED, VERDE, DGNB). After this, the mathematical multi-criteria analysis tool MIVES was analyzed for its application, in the thesis, to railway infrastructure.
The second part of the thesis aims to develop a new model of indicators, inspired by the MIVES method, consisting of a decision-making system based on the three foundations of sustainability: impact on nature, social concerns and economic relevance. The MIVES methodology structures the decision process along three axes: requirements, components and life cycle. These axes define the boundaries of the system. The requirements axis, or requirements tree, is structured in three levels: specific requirements, criteria and indicators. In addition, it is necessary to define the value function for each indicator, define the importance weight of each element of the tree and, finally, calculate the value of each alternative in order to select the best one. The generation of this requirements tree for railway structures and the measurement of the parameters are original for this type of structure. Finally, after its development, the methodology was validated through practical implementation, applying the proposed method to two existing railway bridges. The results showed that the tool is able to establish a ranking of the interventions that is coherent and discriminating enough for the decision maker to have no doubts when making the decision. This phase is one of the major contributions of the thesis, since it makes it possible to differentiate the weights obtained in each of the study areas, where the decision may vary depending on the needs of the decision maker, the location of the bridge under study, etc.
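The last steps described above (a value function per indicator, a weight per tree element, then one aggregated value per alternative) can be sketched as a weighted sum. Everything concrete here is an assumption for illustration: the indicator names, the weights, the worst and best reference levels, the two bridge alternatives, and the linear shape of the value function. Each flat weight stands in for the product of the weights along one branch of the requirements tree.

```python
def linear_value(x, worst, best):
    """Map an indicator reading to [0, 1]; the linear shape is an assumption."""
    v = (x - worst) / (best - worst)
    return min(1.0, max(0.0, v))

def alternative_value(readings, indicators):
    """Weighted sum of indicator values for one alternative."""
    return sum(weight * linear_value(readings[name], worst, best)
               for name, (weight, worst, best) in indicators.items())

# Hypothetical indicators: name -> (weight, worst level, best level); weights sum to 1.
indicators = {
    "co2_tonnes": (0.5, 1000.0, 200.0),   # environmental requirement
    "cost_meur":  (0.3, 50.0, 20.0),      # economic requirement
    "noise_db":   (0.2, 80.0, 50.0),      # social requirement
}

# Hypothetical readings for two design alternatives.
alternatives = {
    "bridge_a": {"co2_tonnes": 600.0, "cost_meur": 35.0, "noise_db": 65.0},
    "bridge_b": {"co2_tonnes": 400.0, "cost_meur": 44.0, "noise_db": 74.0},
}

scores = {name: alternative_value(r, indicators) for name, r in alternatives.items()}
best = max(scores, key=scores.get)
```

Changing the weights to reflect a different decision maker's priorities can change which alternative comes out on top, which is the sensitivity the abstract points to.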

Relevance:

60.00%

Abstract:

In this work, we study unique continuation properties for solutions of the Schrödinger-type equation with a point interaction centered at \(x=0\), \(\partial_t u = i(\Delta_Z + V)u\), where \(V=V(x,t)\) is a real-valued function and \(-\Delta_Z\) is the operator formally written as \[-\Delta_Z = -\frac{d^2}{dx^2} + Z\delta_0,\] where \(\delta_0\) is the Dirac delta centered at zero and \(Z\) is any real number. We then use these results to examine the possible concentration phenomenon of blow-up solutions of the nonlinear Schrödinger-type equation with a point interaction at \(x=0\), \[\partial_t u = i(\Delta_Z u + |u|^{\rho-1}u),\] with \(\rho>5\). We also show that, under certain conditions on the time-dependent potential \(V\), the linear equation above has nontrivial solutions.

Relevance:

60.00%

Abstract:

Large amounts of information can be overwhelming and costly to process, especially when transmitting data over a network. A typical modern Geographical Information System (GIS) brings all types of data together based on the geographic component of the data and provides simple point-and-click query capabilities as well as complex analysis tools. Querying a Geographical Information System, however, can be prohibitively expensive due to the large amounts of data which may need to be processed. Since the use of GIS technology has grown dramatically in the past few years, there is now more than ever a need to provide users with the fastest and least expensive query capabilities, especially since an estimated 80% of data stored in corporate databases has a geographical component. However, not every application requires the same high-quality data for its processing. In this paper we address the issue of reducing the cost and response time of GIS queries by preaggregating data at the expense of data accuracy and precision. We present computational issues in the generation of multi-level resolutions of spatial data and show that the problem of finding the best approximation for a given region and a real-valued function on this region, under a predictable error, is in general NP-complete.

Relevance:

60.00%

Abstract:

An optimal stochastic controller pushes the closed-loop behavior as close as possible to the desired one. The fully probabilistic design (FPD) uses a probabilistic description of the desired closed loop and minimizes the Kullback-Leibler divergence of the closed-loop description to the desired one. Practical exploitation of fully probabilistic design control theory continues to be hindered by the computational complexity of numerically solving the associated stochastic dynamic programming problem, in particular very hard multivariate integration and approximate interpolation of the multivariate functions involved. This paper proposes a new fully probabilistic control algorithm that uses adaptive critic methods to circumvent the need for explicitly evaluating the optimal value function, thereby dramatically reducing computational requirements. This is a main contribution of this short paper.
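The quantity that FPD minimizes can be illustrated in a deliberately simplified static setting: given a desired distribution over a few discrete outcomes, pick the candidate whose induced distribution has the smallest Kullback-Leibler divergence to it. The distributions and controller labels below are made up for illustration, and the full FPD problem is over closed-loop trajectories rather than a single discrete distribution.

```python
import numpy as np

def kl_divergence(p, q):
    """D(p || q) for discrete distributions, assuming q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                      # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical desired closed-loop distribution over three outcomes.
desired = [0.7, 0.2, 0.1]

# Hypothetical closed-loop distributions induced by two candidate controllers.
closed_loop = {
    "controller_a": [0.6, 0.3, 0.1],
    "controller_b": [0.2, 0.3, 0.5],
}

divergences = {name: kl_divergence(p, desired) for name, p in closed_loop.items()}
best = min(divergences, key=divergences.get)
```

Note the argument order: the divergence is taken from the achieved closed-loop description to the desired one, matching the wording of the abstract, and it is not symmetric.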

Relevance:

60.00%

Abstract:

Smart grid technologies have given rise to a liberalised and decentralised electricity market, enabling energy providers and retailers to have a better understanding of the demand side and its response to pricing signals. This paper puts forward a reinforcement-learning-powered tool aiding an electricity retailer to define the tariff prices it offers, in a bid to optimise its retail strategy. In a competitive market, an energy retailer aims to simultaneously increase the number of contracted customers and its profit margin. We have abstracted the problem of deciding on a tariff price as faced by a retailer, as a semi-Markov decision problem (SMDP). A hierarchical reinforcement learning approach, MaxQ value function decomposition, is applied to solve the SMDP through interactions with the market. To evaluate our trading strategy, we developed a retailer agent (termed AstonTAC) that uses the proposed SMDP framework to act in an open multi-agent simulation environment, the Power Trading Agent Competition (Power TAC). An evaluation and analysis of the 2013 Power TAC finals show that AstonTAC successfully selects sell prices that attract as many customers as necessary to maximise the profit margin. Moreover, during the competition, AstonTAC was the only retailer agent performing well across all retail market settings.

Relevance:

60.00%

Abstract:

Bayesian nonparametric models, such as the Gaussian process and the Dirichlet process, have been extensively applied for target kinematics modeling in various applications including environmental monitoring, traffic planning, endangered species tracking, dynamic scene analysis, autonomous robot navigation, and human motion modeling. As shown by these successful applications, Bayesian nonparametric models are able to adjust their complexities adaptively from data as necessary, and are resistant to overfitting or underfitting. However, most existing works assume that the sensor measurements used to learn the Bayesian nonparametric target kinematics models are obtained a priori or that the target kinematics can be measured by the sensor at any given time throughout the task. Little work has been done on controlling a sensor with a bounded field of view to obtain measurements of mobile targets that are most informative for reducing the uncertainty of the Bayesian nonparametric models. To present the systematic sensor planning approach to learning Bayesian nonparametric models, the Gaussian process target kinematics model is introduced first, which is capable of describing time-invariant spatial phenomena, such as ocean currents, temperature distributions and wind velocity fields. The Dirichlet process-Gaussian process target kinematics model is subsequently discussed for modeling mixtures of mobile targets, such as pedestrian motion patterns.

Novel information theoretic functions are developed for these introduced Bayesian nonparametric target kinematics models to represent the expected utility of measurements as a function of sensor control inputs and random environmental variables. A Gaussian process expected Kullback-Leibler divergence is developed as the expectation of the KL divergence between the current (prior) and posterior Gaussian process target kinematics models with respect to the future measurements. Then, this approach is extended to develop a new information value function that can be used to estimate target kinematics described by a Dirichlet process-Gaussian process mixture model. A theorem is proposed that shows the novel information theoretic functions are bounded. Based on this theorem, efficient estimators of the new information theoretic functions are designed, which are proved to be unbiased with the variance of the resultant approximation error decreasing linearly as the number of samples increases. Computational complexities for optimizing the novel information theoretic functions under sensor dynamics constraints are studied, and are proved to be NP-hard. A cumulative lower bound is then proposed to reduce the computational complexity to polynomial time.

Three sensor planning algorithms are developed according to the assumptions on the target kinematics and the sensor dynamics. For problems where the control space of the sensor is discrete, a greedy algorithm is proposed. The efficiency of the greedy algorithm is demonstrated by a numerical experiment with data of ocean currents obtained by moored buoys. A sweep line algorithm is developed for applications where the sensor control space is continuous and unconstrained. Synthetic simulations as well as physical experiments with ground robots and a surveillance camera are conducted to evaluate the performance of the sweep line algorithm. Moreover, a lexicographic algorithm is designed based on the cumulative lower bound of the novel information theoretic functions, for the scenario where the sensor dynamics are constrained. Numerical experiments with real data collected from indoor pedestrians by a commercial pan-tilt camera are performed to examine the lexicographic algorithm. Results from both the numerical simulations and the physical experiments show that the three sensor planning algorithms proposed in this dissertation based on the novel information theoretic functions are superior at learning the target kinematics with little or no prior knowledge.

Relevance:

60.00%

Abstract:

The challenge of detecting a change in the distribution of data is a sequential decision problem that is relevant to many engineering solutions, including quality control and machine and process monitoring. This dissertation develops techniques for exact solution of change-detection problems with discrete time and discrete observations. Change-detection problems are classified as Bayes or minimax based on the availability of information on the change-time distribution. A Bayes optimal solution uses prior information about the distribution of the change time to minimize the expected cost, whereas a minimax optimal solution minimizes the cost under the worst-case change-time distribution. Both types of problems are addressed. The most important result of the dissertation is the development of a polynomial-time algorithm for the solution of important classes of Markov Bayes change-detection problems. Existing techniques for epsilon-exact solution of partially observable Markov decision processes have complexity exponential in the number of observation symbols. A new algorithm, called constellation induction, exploits the concavity and Lipschitz continuity of the value function, and has complexity polynomial in the number of observation symbols. It is shown that change-detection problems with a geometric change-time distribution and identically- and independently-distributed observations before and after the change are solvable in polynomial time. Also, change-detection problems on hidden Markov models with a fixed number of recurrent states are solvable in polynomial time. A detailed implementation and analysis of the constellation-induction algorithm are provided. Exact solution methods are also established for several types of minimax change-detection problems. Finite-horizon problems with arbitrary observation distributions are modeled as extensive-form games and solved using linear programs. 
Infinite-horizon problems with linear penalty for detection delay and identically- and independently-distributed observations can be solved in polynomial time via epsilon-optimal parameterization of a cumulative-sum procedure. Finally, the properties of policies for change-detection problems are described and analyzed. Simple classes of formal languages are shown to be sufficient for epsilon-exact solution of change-detection problems, and methods for finding minimally sized policy representations are described.
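For orientation, the cumulative-sum procedure mentioned above can be sketched in its textbook form for i.i.d. discrete observations: accumulate the log-likelihood ratio of the post-change to the pre-change distribution, reset at zero, and raise an alarm when a threshold is crossed. The distributions, the threshold, and the observation sequence below are hypothetical, and the sketch does not attempt the epsilon-optimal parameterization developed in the dissertation.

```python
import math

def cusum_alarm(observations, p0, p1, threshold):
    """Return the index of the first alarm, or None if no alarm is raised.

    p0, p1: dicts mapping each observation symbol to its probability
            before and after the change, respectively.
    """
    stat = 0.0
    for t, x in enumerate(observations):
        # Add the log-likelihood ratio and clip at zero (CUSUM recursion).
        stat = max(0.0, stat + math.log(p1[x] / p0[x]))
        if stat >= threshold:
            return t
    return None

# Hypothetical binary observations: symbol 1 is rare before the change.
p0 = {0: 0.9, 1: 0.1}
p1 = {0: 0.5, 1: 0.5}
observations = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0]

alarm = cusum_alarm(observations, p0, p1, threshold=3.0)
```

A lower threshold shortens the detection delay but raises the false-alarm rate, which is the trade-off that the penalty for detection delay in the abstract is meant to balance.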

Relevance:

60.00%

Abstract:

Chapter 1: Under the average common value function, we select almost uniquely the mechanism that gives the seller the largest portion of the true value in the worst situation among all the direct mechanisms that are feasible, ex-post implementable and individually rational. Chapter 2: Strategy-proof, budget balanced, anonymous, envy-free linear mechanisms assign p identical objects to n agents. The efficiency loss is the largest ratio of surplus loss to efficient surplus, over all profiles of non-negative valuations. The smallest efficiency loss is uniquely achieved by the following simple allocation rule: assigns one object to each of the p−1 agents with the highest valuation, a large probability to the agent with the pth highest valuation, and the remaining probability to the agent with the (p+1)th highest valuation. When “envy freeness” is replaced by the weaker condition “voluntary participation”, the optimal mechanism differs only when p is much less than n. Chapter 3: One group is to be selected among a set of agents. Agents have preferences over the size of the group if they are selected; and preferences over size as well as the “stand-outside” option are single-peaked. We take a mechanism design approach and search for group selection mechanisms that are efficient, strategy-proof and individually rational. Two classes of such mechanisms are presented. The proposing mechanism allows agents to either maintain or shrink the group size following a fixed priority, and is characterized by group strategy-proofness. The voting mechanism enlarges the group size in each voting round, and achieves at least half of the maximum group size compatible with individual rationality.