Digital Library

52 results for GRIBOV HORIZON

at Indian Institute of Science - Bangalore - India

Parametrized actor-critic algorithms for finite-horizon MDPs

Relevance:

20.00%

Abstract:

Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show convergence w.p. 1 to a set satisfying the necessary condition for constrained optima. The proposed parameterization is for FH-MDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as a critic for the proposed actor recursion. Simulation results from flow control in communication networks attest to the performance advantages of all three algorithms.
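To make the "one transition matrix per stage" point concrete, here is a minimal sketch of classical backward induction for a small FH-MDP. It is not the actor-critic method of the abstract; it only illustrates the stage-indexed model that such methods target. The function name `backward_induction`, the array layout, and the toy numbers are all assumptions chosen for illustration.

```python
import numpy as np

def backward_induction(P, c, terminal_cost):
    """Exact dynamic programming for a finite-horizon MDP (cost minimization).

    P[t][a] is the (S, S) transition matrix for stage t and action a,
    c[t] is the (S, A) single-stage cost array for stage t,
    terminal_cost is the (S,) cost incurred after the last stage.
    """
    T = len(P)
    S = terminal_cost.shape[0]
    V = terminal_cost.astype(float)
    policy = np.zeros((T, S), dtype=int)
    for t in reversed(range(T)):
        A = len(P[t])
        # Q[s, a] = c_t(s, a) + sum_j P_t(j | s, a) * V_{t+1}(j)
        Q = np.stack([c[t][:, a] + P[t][a] @ V for a in range(A)], axis=1)
        policy[t] = Q.argmin(axis=1)
        V = Q.min(axis=1)
    return V, policy

# Hypothetical toy problem: 2 states, 2 actions (0 = stay, 1 = swap), 2 stages.
stay = np.eye(2)
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
P = [[stay, swap], [stay, swap]]            # one set of matrices per stage
cost = np.array([[0.0, 0.5], [1.0, 1.5]])   # rows: states, columns: actions
c = [cost, cost]
V0, policy = backward_induction(P, c, np.array([0.0, 10.0]))
```

Note that the policy table has one row per stage, so storage grows with the horizon length; this is the dimensionality burden that the abstract attributes to non-stationarity.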

A reinforcement learning based algorithm for finite horizon Markov decision processes

Relevance:

20.00%

Abstract:

We develop a simulation-based algorithm for finite-horizon Markov decision processes with finite state and finite action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Optimal infinite-horizon feedback laws for a general class of constrained discrete-time systems: stability and moving-horizon approximations

Relevance:

20.00%

Abstract:

Stability results are given for a class of feedback systems arising from the regulation of time-varying discrete-time systems using optimal infinite-horizon and moving-horizon feedback laws. The class is characterized by joint constraints on the state and the control, a general nonlinear cost function and nonlinear equations of motion possessing two special properties. It is shown that weak conditions on the cost function and the constraints are sufficient to guarantee uniform asymptotic stability of both the optimal infinite-horizon and moving-horizon feedback systems. The infinite-horizon cost associated with the moving-horizon feedback law approaches the optimal infinite-horizon cost as the moving horizon is extended.

An Actor-Critic Algorithm for Finite Horizon Markov Decision Processes

Relevance:

20.00%

Abstract:

We develop a simulation-based algorithm for finite-horizon Markov decision processes with finite state and finite action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Parameter estimation of a two-horizon soil profile by combining crop canopy and surface soil moisture observations using GLUE

Relevance:

20.00%

Abstract:

Estimation of soil parameters by inverse modeling using observations on either surface soil moisture or crop variables has been successfully attempted in many studies, but difficulties in estimating root zone properties arise when heterogeneous layered soils are considered. The objective of this study was to explore the potential of combining observations on surface soil moisture and crop variables - leaf area index (LAI) and above-ground biomass - for estimating soil parameters (water holding capacity and soil depth) in a two-layered soil system using inversion of the crop model STICS. This was performed using the GLUE method on a synthetic data set on varying soil types and on a data set from a field experiment carried out in two maize plots in South India. The main results were (i) the combination of surface soil moisture and above-ground biomass provided consistently good estimates of soil properties, with small uncertainty, for the two soil layers and for a wide range of soil parameter values, both in the synthetic and the field experiment, (ii) above-ground biomass was found to give relatively better estimates and lower uncertainty than LAI when combined with surface soil moisture, especially for estimation of soil depth, (iii) surface soil moisture data, either alone or combined with crop variables, provided a very good estimate of the water holding capacity of the upper soil layer with very small uncertainty, whereas using the surface soil moisture alone gave very poor estimates of the soil properties of the deeper layer, and (iv) using crop variables alone (either above-ground biomass or LAI) provided reasonable estimates of the deeper layer properties depending on the soil type but provided poor estimates of the first layer properties.
The robustness of combining observations of the surface soil moisture and the above-ground biomass for estimating two-layer soil properties, which was demonstrated using both synthetic and field experiments in this study, now needs to be tested for a broader range of climatic conditions and crop types, to assess its potential for spatial applications. (C) 2012 Elsevier B.V. All rights reserved.

Non-Stationary Semi-Markov Decision Processes on a Finite Horizon

Relevance:

20.00%

Abstract:

We introduce and study a class of non-stationary semi-Markov decision processes on a finite horizon. By constructing an equivalent Markov decision process, we establish the existence of a piecewise open loop relaxed control which is optimal for the finite horizon problem.

Model Predictive Static Programming: A Computationally Efficient Technique For Suboptimal Control Design

Relevance:

10.00%

Abstract:

Combining the philosophies of nonlinear model predictive control and approximate dynamic programming, a new suboptimal control design technique is presented in this paper, named model predictive static programming (MPSP), which is applicable to finite-horizon nonlinear problems with terminal constraints. This technique is computationally efficient, and hence can possibly be implemented online. The effectiveness of the proposed method is demonstrated by designing an ascent phase guidance scheme for a ballistic missile propelled by solid motors. A comparison study with a conventional gradient method shows that the MPSP solution is quite close to the optimal solution.

Portfolio Optimization in a Semi-Markov Modulated Market

Relevance:

10.00%

Abstract:

We address a portfolio optimization problem in a semi-Markov modulated market. We study both terminal expected utility optimization over a finite time horizon and risk-sensitive portfolio optimization over finite and infinite time horizons. We obtain optimal portfolios in relevant cases. A numerical procedure is also developed to compute the optimal expected terminal utility for the finite-horizon problem.

Singularity-free cosmology: A simple model

Relevance:

10.00%

Abstract:

The nonminimal coupling of a self-interacting complex scalar field with gravity is studied. For a Robertson-Walker open universe the stable solutions of the scalar-field equations are time dependent. As a result of this, a novel spontaneous symmetry breaking occurs which leads to a varying effective gravitational coupling coefficient. It is found that the coupling coefficient changes sign below a critical "radius" of the Universe, implying the appearance of repulsive gravity. The occurrence of the repulsive interaction at an early epoch facilitates singularity avoidance. The model also provides a solution to the horizon problem.

Optimal simultaneous maximum a posteriori estimation of states, noise statistics and parameters I. Algorithm

Relevance:

10.00%

Abstract:

The simultaneous state and parameter estimation problem for a linear discrete-time system with unknown noise statistics is treated as a large-scale optimization problem. The a posteriori probability density function is maximized directly with respect to the states and parameters subject to the constraint of the system dynamics. The resulting optimization problem is too large for any of the standard non-linear programming techniques and hence a hierarchical optimization approach is proposed. It turns out that the states can be computed at the first level for given noise and system parameters. These, in turn, are to be modified at the second level. The states are to be computed from a large system of linear equations, and two solution methods are considered for solving these equations, limiting the horizon to a suitable length. The resulting algorithm is a filter-smoother, suitable for off-line as well as on-line state estimation for given noise and system parameters. The second-level problem is split up into two, one for modifying the noise statistics and the other for modifying the system parameters. An adaptive relaxation technique is proposed for modifying the noise statistics and a modified Gauss-Newton technique is used to adjust the system parameters.

Two-temperature accretion around rotating black holes: a description of the general advective flow paradigm in the presence of various cooling processes to explain low to high luminous sources

Relevance:

10.00%

Abstract:

We investigate viscous two-temperature accretion disc flows around rotating black holes. We describe the global solution of accretion flows with a sub-Keplerian angular momentum profile, by solving the underlying conservation equations including explicit cooling processes self-consistently. Bremsstrahlung, synchrotron and inverse Comptonization of soft photons are considered as possible cooling mechanisms. We focus on the set of solutions for sub-Eddington, Eddington and super-Eddington mass accretion rates around Schwarzschild and Kerr black holes with a Kerr parameter of 0.998. It is found that the flow, during its infall from the Keplerian to sub-Keplerian transition region to the black hole event horizon, passes through various phases of advection: the general advective paradigm to the radiatively inefficient phase, and vice versa. Hence, the flow governs a much lower electron temperature $\sim 10^{8}$-$10^{9.5}$ K, in the range of accretion rate in Eddington units $0.01 \lesssim \dot{M} \lesssim 100$, compared to the hot protons of temperature $\sim 10^{10.2}$-$10^{11.8}$ K. Therefore, the solution may potentially explain the hard X-rays and gamma-rays emitted from active galactic nuclei (AGNs) and X-ray binaries. We then compare the solutions for two different regimes of viscosity. We conclude that a weakly viscous flow is expected to be cooling dominated, particularly at the inner region of the disc, compared to its highly viscous counterpart, which is radiatively inefficient. With all the solutions in hand, we finally reproduce the observed luminosities of the underfed AGNs and quasars (e.g. Sgr A*) to ultraluminous X-ray sources (e.g. SS433), at different combinations of input parameters, such as the mass accretion rate and the ratio of specific heats. The set of solutions also predicts appropriately the luminosity observed in highly luminous AGNs and ultraluminous quasars (e.g. PKS 0743-67).

Using a structural approach to identify relationships between soil and erosion in a semi-humid forested area, South India

Relevance:

10.00%

Abstract:

Biogeochemical and hydrological cycles are currently studied on a small experimental forested watershed (4.5 km²) in semi-humid South India. This paper presents some of the first data on the distribution and dynamics of a widespread red soil (Ferralsols and Chromic Luvisols) and black soil (Vertisols and Vertic intergrades) cover, and its possible relationship with the recent development of the erosion process. The soil map was established from the observation of isolated soil profiles and toposequences, and surveys of soil electromagnetic conductivity (EM31, Geonics Ltd), lithology and vegetation. The distribution of the different parts of the soil cover in relation to each other was used to establish the dynamics and chronological order of formation. Results indicate that both topography and lithology (gneiss and amphibolite) have influenced the distribution of the soils. Downslope, the following parts of the soil cover were distinguished: i) red soil system, ii) black soil system, iii) bleached horizon at the top of the black soil and iv) bleached sandy saprolite at the base of the black soil. The red soil is currently transforming into black soil and the transformation front is moving upslope. In the bottom part of the slope, the chronology appears to be the following: black soil > bleached horizon at the top of the black soil > streambed > bleached horizon below the black soil. It appears that the development of the drainage network is a recent process, which was guided by the presence of thin black soil with a vertic horizon less than 2 m deep. Three distinctive types of erosional landforms have been identified: 1. rotational slips (Type 1); 2. seepage erosion at the top of the black soil profile (Type 2); 3. a combination of earthflow and sliding in the non-cohesive saprolite of the gneiss, occurring at midslope (Type 3).
Types 1 and 2 erosion occur mainly downslope and are always located at the intersection between the streambed and the red soil-black soil contact. Neutron probe monitoring, along an area vulnerable to erosion types 1 and 2, indicates that rotational slips are caused by a temporary watertable at the base of the black soil and within the sandy bleached saprolite, which behaves as a plane of weakness. The watertable is induced by the ephemeral watercourse. Erosion type 2 is caused by seepage of a perched watertable, which occurs after swelling and closing of the cracks of the vertic clay horizon and within a light-textured and bleached horizon at the top of the black soil. Type 3 erosion is not related to the red soil-black soil system but is caused by the seasonal seepage of saturated throughflow in the sandy saprolite of the gneiss occurring at midslope. (c) 2006 Elsevier B.V. All rights reserved.

A Fair Contract for Managing Water Scarcity

Relevance:

10.00%

Abstract:

In public utilities, under supply constraints, fairness considerations lead to a market failure. This paper characterizes a two-period principal-agent contract for demand management that mitigates this market failure in urban water systems. The contract is designed as an extensive form mechanism using subgame perfect Nash equilibrium (SPNE) as the solution concept. The contract is fair, and is shown to be economically efficient if, in case of deviation by the agent, the gain to the agent and the loss to the principal are small. It is shown that this assumption can be avoided in an infinite-horizon contract.

Solving MDPs using two-timescale simulated annealing with multiplicative weights

Relevance:

10.00%

Abstract:

We develop extensions of the Simulated Annealing with Multiplicative Weights (SAMW) algorithm, which was proposed as a method for solving Finite-Horizon Markov Decision Processes (FH-MDPs). The extensions developed are in three directions: a) use of the dynamic programming principle in the policy update step of SAMW, b) a two-timescale actor-critic algorithm that uses simulated transitions alone, and c) extension of the algorithm to the infinite-horizon discounted-reward scenario. In particular, a) reduces the storage required from exponential to linear in the number of actions per stage-state pair. On the faster timescale, a 'critic' recursion performs policy evaluation while on the slower timescale an 'actor' recursion performs policy improvement using SAMW. We give a proof outlining convergence w.p. 1 and show experimental results on two settings: semiconductor fabrication and flow control in communication networks.

Solution of MDPs using simulation-based value iteration

Relevance:

10.00%

Abstract:

This article proposes a three-timescale simulation-based algorithm for the solution of infinite-horizon Markov Decision Processes (MDPs). We assume a finite state space and discounted cost criterion and adopt the value iteration approach. An approximation of the Dynamic Programming operator T is applied to the value function iterates. This 'approximate' operator is implemented using three timescales, the slowest of which updates the value function iterates. On the middle timescale we perform a gradient search over the feasible action set of each state using Simultaneous Perturbation Stochastic Approximation (SPSA) gradient estimates, thus finding the minimizing action in T. On the fastest timescale, the 'critic' estimates, over which the gradient search is performed, are obtained. A sketch of convergence explaining the dynamics of the algorithm using associated ODEs is also presented. Numerical experiments on rate-based flow control on a bottleneck node using a continuous-time queueing model are performed using the proposed algorithm. The results obtained are verified against classical value iteration where the feasible set is suitably discretized. Over such a discretized setting, a variant of the algorithm of [12] is compared and the proposed algorithm is found to converge faster.
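For reference, the classical value iteration baseline that the abstract compares against can be sketched as below for a finite state space, a finite (discretized) action set, and discounted cost. This is the textbook exact method, not the three-timescale simulation-based algorithm of the article; the function name `value_iteration`, the array layout, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, c, gamma, tol=1e-8, max_iter=10_000):
    """Classical value iteration for a discounted-cost MDP.

    P has shape (A, S, S) with P[a, s, j] = Pr(next = j | state = s, action = a),
    c has shape (S, A) and holds single-stage costs,
    gamma in [0, 1) is the discount factor.
    """
    S, A = c.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Apply the dynamic programming operator T:
        # Q[s, a] = c(s, a) + gamma * sum_j P[a, s, j] * V[j]
        Q = c + gamma * np.einsum("asj,j->sa", P, V)
        V_new = Q.min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmin(axis=1)

# Hypothetical toy problem: 2 states, 2 actions (0 = stay, 1 = swap).
P = np.array([[[1.0, 0.0], [0.0, 1.0]],    # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])   # action 1: swap
c = np.array([[0.0, 0.5], [1.0, 0.5]])
V, greedy = value_iteration(P, c, gamma=0.5)
```

In this toy instance, staying in state 0 is free, so the greedy policy moves from state 1 to state 0 once and stays there.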
