3 results for Public policy evaluation

at Indian Institute of Science - Bangalore - India


Relevance:

80.00%

Publisher:

Abstract:

Because they are non-stationary, finite-horizon Markov decision processes (FH-MDPs) have one transition probability matrix per stage. The curse of dimensionality therefore affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This contrasts with methods in which the critic recursions learn the cost-to-go itself. We show convergence with probability 1 (w.p. 1) to a set satisfying the necessary conditions for constrained optima. The proposed parametrization is for FH-MDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as a critic for the proposed actor recursion. Simulation results from flow control in communication networks attest to the performance advantages of all three algorithms.
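To illustrate the "one transition matrix per stage" structure mentioned in this abstract, here is a minimal sketch of exact backward induction on a small finite-horizon MDP with finite action sets. It is not one of the actor-critic algorithms the abstract proposes; it is the classical dynamic-programming baseline, and the names `backward_induction`, `P`, `c`, and `terminal_cost` are assumptions chosen for illustration.

```python
import numpy as np

def backward_induction(P, c, terminal_cost):
    """Solve a finite-horizon MDP by backward induction (illustrative sketch).

    P[t][a] : (S x S) transition matrix for action a at stage t (hypothetical layout)
    c[t]    : (S x A) array of single-stage costs at stage t
    terminal_cost : length-S vector of costs at the final stage
    Returns the stage-0 cost-to-go and one decision rule per stage.
    """
    T = len(P)
    V = np.asarray(terminal_cost, dtype=float)
    policy = []
    for t in reversed(range(T)):
        num_actions = len(P[t])
        # Q[s, a] = c_t(s, a) + sum over s' of P_t^a(s, s') * V_{t+1}(s')
        Q = c[t] + np.stack([P[t][a] @ V for a in range(num_actions)], axis=1)
        policy.append(Q.argmin(axis=1))  # cost-minimising action per state
        V = Q.min(axis=1)
    policy.reverse()  # policy[t] is the decision rule for stage t
    return V, policy
```

Because `P` and `policy` are both indexed by stage, storage grows with the horizon length, which is the non-stationarity the abstract refers to.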

Relevance:

80.00%

Publisher:

Abstract:

We develop extensions of the Simulated Annealing with Multiplicative Weights (SAMW) algorithm, which was proposed as a solution method for Finite-Horizon Markov Decision Processes (FH-MDPs). The extensions are in three directions: a) use of the dynamic programming principle in the policy update step of SAMW, b) a two-timescale actor-critic algorithm that uses simulated transitions alone, and c) extension of the algorithm to the infinite-horizon discounted-reward scenario. In particular, a) reduces the storage required from exponential to linear in the number of actions per stage-state pair. On the faster timescale, a 'critic' recursion performs policy evaluation, while on the slower timescale an 'actor' recursion performs policy improvement using SAMW. We outline a proof of convergence w.p. 1 and show experimental results in two settings: semiconductor fabrication and flow control in communication networks.
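As a rough illustration of the multiplicative-weights idea behind this abstract, the sketch below applies one generic multiplicative-weights update to a probability vector over a finite set of candidate policies. It is an assumption-laden sketch, not necessarily the exact SAMW rule or any of the extensions described; the function name `multiplicative_weights_step` and the parameter `beta` are hypothetical.

```python
import numpy as np

def multiplicative_weights_step(weights, values, beta):
    """One generic multiplicative-weights update (illustrative, not exact SAMW).

    weights : probability vector over candidate policies
    values  : estimated reward of each policy (higher is better)
    beta    : base greater than 1; larger values sharpen the distribution faster
    """
    weights = np.asarray(weights, dtype=float)
    values = np.asarray(values, dtype=float)
    # Scale each weight by beta ** value, then renormalise to a distribution.
    new_weights = weights * np.power(beta, values)
    return new_weights / new_weights.sum()
```

Keeping one weight per complete policy is what makes storage grow exponentially: with S states, A actions, and T stages there are A^(S·T) deterministic policies.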

Relevance:

80.00%

Publisher:

Abstract:

This paper analyses the influence of management on Technical Efficiency Change (TEC) and Technological Progress (TP) in the communication equipment and consumer electronics sub-sectors of the Indian hardware electronics industry. Each sub-sector comprises 13 sample firms for two time periods. The primary objective is to determine the relative contribution of TP and TEC to Total Factor Productivity Growth (TFPG) and to establish the influence of firm-specific operational management decision variables on these two components. The study finds that both sub-sectors have striven for and achieved steady TP, but not TEC, in the period of economic liberalisation in order to cope with intensifying competition. Management decisions with respect to asset and profit utilisation and vertical integration, among others, improved TP and technical efficiency (TE) in the sub-sectors. However, R&D investments and technology imports proved costly for TFP, indicating inadequate efforts and/or poor resource utilisation by the management. Management was found to be complacent about improving or developing its own technology, as indicated by its higher dependence on imported raw materials and the absence of any influence of R&D on TP.