975 results for Stochastic Translog Cost Frontier


Relevance:

30.00%

Publisher:

Abstract:

We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
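The SPSA gradient estimate mentioned in the abstract above can be illustrated with a minimal sketch. This is not the article's actor-critic algorithm: it only shows the generic two-sided SPSA estimator applied to a toy quadratic objective. The function names, the gain constants, and the test function are assumptions chosen for illustration.

```python
import random


def spsa_gradient(f, theta, c=0.1, rng=random):
    """Two-sided SPSA estimate of the gradient of f at theta."""
    # Perturb every coordinate at once with independent +1/-1 draws.
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    theta_plus = [t + c * d for t, d in zip(theta, delta)]
    theta_minus = [t - c * d for t, d in zip(theta, delta)]
    # Only two evaluations of f are needed, whatever the dimension.
    diff = f(theta_plus) - f(theta_minus)
    return [diff / (2.0 * c * d) for d in delta]


def quadratic(theta):
    """Toy objective (an assumption for this sketch), minimised at the origin."""
    return sum(t * t for t in theta)


def spsa_minimize(f, theta, steps=2000, a=0.05, c=0.1, seed=0):
    """Gradient descent driven by SPSA estimates with slowly decaying gains."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, steps + 1):
        step = a / k ** 0.602      # step-size sequence (illustrative choice)
        width = c / k ** 0.101     # perturbation-size sequence (illustrative choice)
        grad = spsa_gradient(f, theta, c=width, rng=rng)
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(spsa_minimize(quadratic, [1.0, -2.0]))
```

The point of the estimator is that its cost does not grow with the number of policy parameters, which is why it is attractive for the actor update.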

Relevance:

30.00%

Publisher:

Abstract:

We present two efficient discrete parameter simulation optimization (DPSO) algorithms for the long-run average cost objective. One of these algorithms uses the smoothed functional approximation (SFA) procedure, while the other is based on simultaneous perturbation stochastic approximation (SPSA). The use of SFA for DPSO had not been proposed previously in the literature. Further, both algorithms adopt an interesting technique of random projections that we present here for the first time. We give a proof of convergence of our algorithms. Next, we present detailed numerical experiments on a problem of admission control with dependent service times. We consider two different settings involving parameter sets that have moderate and large sizes, respectively. On the first setting, we also show performance comparisons with the well-studied optimal computing budget allocation (OCBA) algorithm and also the equal allocation algorithm. Note to Practitioners: Even though SPSA and SFA have been devised in the literature for continuous optimization problems, our results indicate that they can be powerful techniques even when they are adapted to discrete optimization settings. OCBA is widely recognized as one of the most powerful methods for discrete optimization when the parameter sets are of small or moderate size. On a setting involving a parameter set of size 100, we observe that when the computing budget is small, both SPSA and OCBA show similar performance and outperform SFA; however, as the computing budget is increased, SPSA and SFA show better performance than OCBA. Both our algorithms also show good performance when the parameter set has a size of 10^8. SFA is seen to show the best overall performance. Unlike most other DPSO algorithms in the literature, an advantage with our algorithms is that they are easily implementable regardless of the size of the parameter sets and show good performance in both scenarios.

Relevance:

30.00%

Publisher:

Abstract:

We propose for the first time two reinforcement learning algorithms with function approximation for average cost adaptive control of traffic lights. One of these algorithms is a version of Q-learning with function approximation while the other is a policy gradient actor-critic algorithm that incorporates multi-timescale stochastic approximation. We show performance comparisons on various network settings of these algorithms with a range of fixed timing algorithms, as well as a Q-learning algorithm with full state representation that we also implement. We observe that whereas (as expected) on a two-junction corridor, the full state representation algorithm shows the best results, this algorithm is not implementable on larger road networks. The algorithm PG-AC-TLC that we propose is seen to show the best overall performance.

Relevance:

30.00%

Publisher:

Abstract:

Infinite horizon discounted-cost and ergodic-cost risk-sensitive zero-sum stochastic games for controlled Markov chains with countably many states are analyzed. Upper and lower values for these games are established. The existence of value and saddle-point equilibria in the class of Markov strategies is proved for the discounted-cost game. The existence of value and saddle-point equilibria in the class of stationary strategies is proved under the uniform ergodicity condition for the ergodic-cost game. The value of the ergodic-cost game happens to be the product of the inverse of the risk-sensitivity factor and the logarithm of the common Perron-Frobenius eigenvalue of the associated controlled nonlinear kernels. (C) 2013 Elsevier B.V. All rights reserved.
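The closing statement of the abstract above can be written compactly. The symbols are assumptions introduced here for illustration, since the abstract fixes no notation: θ denotes the risk-sensitivity factor, ρ the common Perron-Frobenius eigenvalue of the associated controlled nonlinear kernels, and V* the value of the ergodic-cost game.

```latex
\[
  V^{*} \;=\; \frac{1}{\theta}\,\log \rho
\]
```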

Relevance:

30.00%

Publisher:

Abstract:

Energy harvesting sensor nodes are gaining popularity due to their ability to improve the network lifetime and are becoming a preferred choice supporting green communication. In this paper, we focus on communicating reliably over an additive white Gaussian noise channel using such an energy harvesting sensor node. An important part of this paper involves appropriate modeling of energy harvesting, as done via various practical architectures. Our main result is the characterization of the Shannon capacity of the communication system. The key technical challenge involves dealing with the dynamic (and stochastic) nature of the (quadratic) cost of the input to the channel. As a corollary, we find close connections between the capacity achieving energy management policies and the queueing theoretic throughput optimal policies.

Relevance:

30.00%

Publisher:

Abstract:

Response analysis of a linear structure with uncertainties in both structural parameters and external excitation is considered here. When such an analysis is carried out using the spectral stochastic finite element method (SSFEM), often the computational cost tends to be prohibitive due to the rapid growth of the number of spectral bases with the number of random variables and the order of expansion. For instance, if the excitation contains a random frequency, or if it is a general random process, then a good approximation of these excitations using polynomial chaos expansion (PCE) involves a large number of terms, which leads to very high cost. To address this issue of high computational cost, a hybrid method is proposed in this work. In this method, first the random eigenvalue problem is solved using the weak formulation of SSFEM, which involves solving a system of deterministic nonlinear algebraic equations to estimate the PCE coefficients of the random eigenvalues and eigenvectors. Then the response is estimated using a Monte Carlo (MC) simulation, where the modal bases are sampled from the PCE of the random eigenvectors estimated in the previous step, followed by a numerical time integration. It is observed through numerical studies that this proposed method successfully reduces the computational burden compared with either a pure SSFEM or a pure MC simulation and is more accurate than a perturbation method. The computational gain improves as the problem size in terms of degrees of freedom grows. It also improves as the timespan of interest reduces.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we first derive a necessary and sufficient condition for a stationary strategy to be a Nash equilibrium of a discounted constrained stochastic game under certain assumptions. In this process we also develop a nonlinear (non-convex) optimization problem for a discounted constrained stochastic game. We use the linear best response functions of every player and the complementary slackness theorem for linear programs to derive both the optimization problem and the equivalent condition. We then extend this result to average reward constrained stochastic games. Finally, we present a heuristic algorithm motivated by our necessary and sufficient conditions for a discounted cost constrained stochastic game. We numerically observe the convergence of this algorithm to a Nash equilibrium. (C) 2015 Elsevier B.V. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

In this article, we look at the political business cycle problem through the lens of uncertainty. The feedback control used by us is the famous NKPC with stochasticity and wage rigidities. We extend the New Keynesian Phillips Curve model to the continuous time stochastic set up with an Ornstein-Uhlenbeck process. We minimize relevant expected quadratic cost by solving the corresponding Hamilton-Jacobi-Bellman equation. The basic intuition of the classical model is qualitatively carried forward in our set up but uncertainty also plays an important role in determining the optimal trajectory of the voter support function. The internal variability of the system acts as a base shifter for the support function in the risk neutral case. The role of uncertainty is even more prominent in the risk averse case where all the shape parameters are directly dependent on variability. Thus, in this case variability controls both the rates of change as well as the base shift parameters. To gain more insight we have also studied the model when the coefficients are time invariant and studied numerical solutions. The close relationship between the unemployment rate and the support function for the incumbent party is highlighted. The role of uncertainty in creating sampling fluctuation in this set up, possibly towards apparently anomalous results, is also explored.

Relevance:

30.00%

Publisher:

Abstract:

Partial differential equations (PDEs) with multiscale coefficients are very difficult to solve due to the wide range of scales in the solutions. In the thesis, we propose some efficient numerical methods for both deterministic and stochastic PDEs based on the model reduction technique.

For the deterministic PDEs, the main purpose of our method is to derive an effective equation for the multiscale problem. An essential ingredient is to decompose the harmonic coordinate into a smooth part and a highly oscillatory part of which the magnitude is small. Such a decomposition plays a key role in our construction of the effective equation. We show that the solution to the effective equation is smooth, and could be resolved on a regular coarse mesh grid. Furthermore, we provide error analysis and show that the solution to the effective equation plus a correction term is close to the original multiscale solution.

For the stochastic PDEs, we propose the model reduction based data-driven stochastic method and multilevel Monte Carlo method. In the multiquery setting, and on the assumption that the ratio of the smallest scale and largest scale is not too small, we propose the multiscale data-driven stochastic method. We construct a data-driven stochastic basis and solve the coupled deterministic PDEs to obtain the solutions. For the tougher problems, we propose the multiscale multilevel Monte Carlo method. We apply the multilevel scheme to the effective equations and assemble the stiffness matrices efficiently on each coarse mesh grid. In both methods, the Karhunen-Loève (KL) expansion plays an important role in extracting the main parts of some stochastic quantities.

For both the deterministic and stochastic PDEs, numerical results are presented to demonstrate the accuracy and robustness of the methods. We also show the computational time cost reduction in the numerical examples.

Relevance:

30.00%

Publisher:

Abstract:

The Hamilton Jacobi Bellman (HJB) equation is central to stochastic optimal control (SOC) theory, yielding the optimal solution to general problems specified by known dynamics and a specified cost functional. Given the assumption of quadratic cost on the control input, it is well known that the HJB reduces to a particular partial differential equation (PDE). While powerful, this reduction is not commonly used as the PDE is of second order, is nonlinear, and examples exist where the problem may not have a solution in a classical sense. Furthermore, each state of the system appears as another dimension of the PDE, giving rise to the curse of dimensionality. Since the number of degrees of freedom required to solve the optimal control problem grows exponentially with dimension, the problem becomes intractable for systems with all but modest dimension.

In the last decade researchers have found that under certain, fairly non-restrictive structural assumptions, the HJB may be transformed into a linear PDE, with an interesting analogue in the discretized domain of Markov Decision Processes (MDP). The work presented in this thesis uses the linearity of this particular form of the HJB PDE to push the computational boundaries of stochastic optimal control.

This is done by crafting together previously disjoint lines of research in computation. The first of these is the use of Sum of Squares (SOS) techniques for synthesis of control policies. A candidate polynomial with variable coefficients is proposed as the solution to the stochastic optimal control problem. An SOS relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function. It is shown that these results extend to arbitrary parabolic and elliptic PDEs, yielding a novel method for Uncertainty Quantification (UQ) of systems governed by partial differential constraints. Domain decomposition techniques are also made available, allowing for such problems to be solved via parallelization and low-order polynomials.

The optimization-based SOS technique is then contrasted with the Separated Representation (SR) approach from the applied mathematics community. The technique allows for systems of equations to be solved through a low-rank decomposition that results in algorithms that scale linearly with dimensionality. Its application in stochastic optimal control allows for previously uncomputable problems to be solved quickly, scaling to such complex systems as the Quadcopter and VTOL aircraft. This technique may be combined with the SOS approach, yielding not only a numerical technique, but also an analytical one that allows for entirely new classes of systems to be studied and for stability properties to be guaranteed.

The analysis of the linear HJB is completed by the study of its implications in application. It is shown that the HJB and a popular technique in robotics, the use of navigation functions, sit on opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. Analytical solutions to the HJB in these settings are available in simplified domains, yielding guidance towards optimality for approximation schemes. Finally, the use of HJB equations in temporal multi-task planning problems is investigated. It is demonstrated that such problems are reducible to a sequence of SOC problems linked via boundary conditions. The linearity of the PDE allows us to pre-compute control policy primitives and then compose them, at essentially zero cost, to satisfy a complex temporal logic specification.

Relevance:

30.00%

Publisher:

Abstract:

This study examined the technical efficiency in artisanal fisheries in Lagos State of Nigeria. The study employed a two-stage random sampling procedure for the selection of 120 respondents. The analytical techniques involved descriptive statistics and estimation of technical efficiency following the maximum likelihood estimation (MLE) procedure available in FRONTIER 4.1. The MLE result of the stochastic frontier production function showed that hired labour, cost of repair and capital items are critical factors that influence the productivity of artisanal fishermen, with the coefficient of hired labour being highly elastic. This implies that employing more labour will significantly increase the catch in the study area. The predicted farm efficiency, with an average value of 0.92, showed that there is a marginal potential of about 8 percent to increase the catch, and hence the income of the fishermen. The study further examined the factors that influence the productivity of fishermen in the study area. Years of education, mode of operation and frequency of fishing have important implications for the technical efficiency of fishermen in the study area.

Relevance:

30.00%

Publisher:

Abstract:

The optimal control of problems that are constrained by partial differential equations with uncertainties and with uncertain controls is addressed. The Lagrangian that defines the problem is postulated in terms of stochastic functions, with the control function possibly decomposed into an unknown deterministic component and a known zero-mean stochastic component. The extra freedom provided by the stochastic dimension in defining cost functionals is explored, demonstrating the scope for controlling statistical aspects of the system response. One-shot stochastic finite element methods are used to find approximate solutions to control problems. It is shown that applying the stochastic collocation finite element method to the formulated problem leads to a coupling between stochastic collocation points when a deterministic optimal control is considered or when moments are included in the cost functional, thereby forgoing the primary advantage of the collocation method over the stochastic Galerkin method for the considered problem. The application of the presented methods is demonstrated through a number of numerical examples. The presented framework is sufficiently general to also consider a class of inverse problems, and numerical examples of this type are also presented. © 2011 Elsevier B.V.

Relevance:

30.00%

Publisher:

Abstract:

We uncovered the underlying energy landscape of the mitogen-activated protein kinase signal transduction cellular network by exploring the statistical nature of the Brownian dynamical trajectories. We introduce a dimensionless quantity, the robustness ratio of energy gap versus local roughness, to measure the global topography of the underlying landscape. A high robustness ratio implies a funneled landscape. The landscape is quite robust against environmental fluctuations and variations of the intrinsic chemical reaction rates.

Relevance:

30.00%

Publisher:

Abstract:

Understanding the relationship between environmental protection and economic development is crucial to forming practical environmental policy. At the micro level, implementation of environmental regulations often causes production mills to adjust their technology, which may change productive efficiency and cost, which, in turn, determine the effort level of mills and even local governments in pollution control. Using a stochastic frontier production model and a set of survey data on 126 paper mills from six provinces of China, we measure the technical efficiency changes and analyze the determinants of efficiency. In particular, we examine the impact of environmental policy on paper mills' efficiency, using an indicator of environmental policy, the levy ratio of COD. We also estimate a simultaneous-equation model in which the levy rate and emission are jointly determined. The results indicate that there were efficiency improvements during 1999-2003, when enforcement of environmental regulations was tightened. The impacts, nevertheless, differ across types of mills. We also find that the levy ratio, which is influenced by both the local social and economic conditions and the characteristics of paper mills, such as scale, has a strong impact on the abatement of the pollutant COD. Additionally, paper mills' technical efficiency has a positive effect on the reduction of the emission intensity of the pollutant COD. These results lead to a set of implications pertinent to policy improvement.

Relevance:

30.00%

Publisher:

Abstract:

For a digital echo canceller it is desirable to reduce the adaptation time, during which the transmission of useful data is not possible. LMS is a non-optimal algorithm in this case as the signals involved are statistically non-Gaussian. Walach and Widrow (IEEE Trans. Inform. Theory 30 (2) (March 1984) 275-283) investigated the use of a power of 4, while other research established algorithms with arbitrary integer (Pei and Tseng, IEEE J. Selected Areas Commun. 12 (9) (December 1994) 1540-1547) or non-quadratic power (Shah and Cowan, IEE Proc.-Vis. Image Signal Process. 142 (3) (June 1995) 187-191). This paper suggests that continuous and automatic adaptation of the error exponent gives a more satisfactory result. The proposed family of cost function adaptation (CFA) stochastic gradient algorithms allows an increase in convergence rate and an improvement of residual error. As a special case, the staircase CFA algorithm is presented first; then the smooth CFA is developed. Details of implementations are also discussed. Simulation results are provided to show the properties of the proposed family of algorithms. (C) 2000 Elsevier Science B.V. All rights reserved.
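The role of the error exponent discussed in the abstract above can be sketched with a stochastic gradient update for the cost |e|^p, where p = 2 gives the familiar LMS rule and p = 4 the fourth-power variant. The exponent is held fixed here: the abstract does not give the CFA rule for adapting it, so none is reproduced. The function names, step size, and noiseless two-tap echo path are assumptions for illustration only.

```python
import random


def power_error_update(w, x, d, p=2.0, mu=0.01):
    """One stochastic gradient step on the cost |e|**p, with e = d - w.x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    e = d - y
    sign = (e > 0) - (e < 0)
    # Gradient of |e|**p with respect to w is -p * |e|**(p-1) * sign(e) * x.
    scale = mu * p * abs(e) ** (p - 1.0) * sign
    return [wi + scale * xi for wi, xi in zip(w, x)], e


def identify(true_w, p=2.0, mu=0.01, steps=5000, seed=0):
    """Adapt a filter towards a hypothetical noiseless echo path true_w."""
    rng = random.Random(seed)
    w = [0.0] * len(true_w)
    for _ in range(steps):
        # Binary data symbols as the input (an assumption of this sketch).
        x = [rng.choice((-1.0, 1.0)) for _ in true_w]
        d = sum(t * xi for t, xi in zip(true_w, x))
        w, _ = power_error_update(w, x, d, p=p, mu=mu)
    return w


if __name__ == "__main__":
    print(identify([0.5, -0.3], p=2.0))
    print(identify([0.5, -0.3], p=4.0))
```

With a fixed exponent, a larger p weights large errors more heavily but slows down once the error is small, which is the trade-off that motivates adapting the exponent during convergence.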