5 results for decision framework

at Indian Institute of Science - Bangalore - India


Relevance:

30.00%

Publisher:

Abstract:

Production scheduling in a flexible manufacturing system (FMS) is a real-time combinatorial optimization problem that has been proved to be NP-complete. Solving this problem requires on-line monitoring of plan execution and real-time decision-making: selecting alternative routings, assigning required resources, and rescheduling when failures occur in the system. Expert systems provide a natural framework for solving this kind of NP-complete problem. In this paper, an expert system with a novel parallel heuristic approach is implemented for automatic short-term dynamic scheduling of an FMS. The principal features of the expert system include easy rescheduling, on-line plan execution, load balancing, an on-line garbage collection process, and the use of advanced knowledge representation schemes. Its effectiveness is demonstrated with two examples.
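
To make the dispatching behavior described above concrete, here is a minimal Python sketch (not the paper's expert system; all class names, rules, and the machine/job data are illustrative assumptions) of rule-based rerouting with load balancing when a machine fails:

```python
from dataclasses import dataclass, field

@dataclass
class Machine:
    name: str
    up: bool = True
    queue: list = field(default_factory=list)   # operations waiting on this machine

@dataclass
class Job:
    name: str
    routings: list   # alternative machine sequences for the job

def dispatch(job, machines):
    """Pick the feasible routing whose machines are least loaded (load balancing);
    return None if every routing touches a failed machine (the job must wait)."""
    feasible = [r for r in job.routings if all(machines[m].up for m in r)]
    if not feasible:
        return None
    best = min(feasible, key=lambda r: sum(len(machines[m].queue) for m in r))
    for m in best:
        machines[m].queue.append(job.name)
    return best

machines = {m: Machine(m) for m in ("M1", "M2", "M3")}
machines["M1"].up = False   # a simulated machine failure triggers rerouting
job = Job("J1", [["M1", "M3"], ["M2", "M3"]])
print(dispatch(job, machines))   # -> ['M2', 'M3']: routes around the failed M1
```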

Relevance:

30.00%

Publisher:

Abstract:

In this article we develop the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite-horizon discounted cost framework, in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm uses multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that performs a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy.
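
As a rough illustration of the update structure the abstract describes, the following toy Python sketch combines a TD(0) critic on the per-step Lagrangian cost, an SPSA-based actor, and projected ascent on the Lagrange multiplier, each on its own timescale. The environment, features, step-size schedules, and the use of Monte Carlo rollouts for the SPSA evaluations (the paper evaluates perturbed policies with its TD critic instead) are all assumptions for illustration, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.95            # discount factor
theta = np.zeros(2)     # actor (policy) parameters
v = np.zeros(2)         # linear critic weights
lam = 0.0               # Lagrange multiplier (one inequality constraint)
C_MAX = 0.5             # assumed level for the discounted constraint cost

def features(s):
    return np.array([s, 1.0])

def step(s, theta):
    """Toy linear dynamics; the 'policy' is a noisy linear state feedback."""
    a = float(theta @ features(s)) + 0.1 * rng.standard_normal()
    s_next = 0.8 * s + a
    return s_next, s_next ** 2, a ** 2   # next state, objective cost, constraint cost

def rollout_cost(theta, lam, s0=1.0, T=30):
    """Monte Carlo estimate of the discounted Lagrangian cost from s0."""
    s, total = s0, 0.0
    for t in range(T):
        s, c, g = step(s, theta)
        total += gamma ** t * (c + lam * (g - C_MAX))
    return total

s = 1.0
for n in range(1, 3000):
    a_n = 1.0 / n            # slowest step size: Lagrange multiplier
    b_n = 1.0 / n ** 0.75    # middle step size: actor
    c_n = 1.0 / n ** 0.6     # fastest step size: critic
    # critic: TD(0) on the per-step Lagrangian cost
    s_next, c, g = step(s, theta)
    delta = c + lam * (g - C_MAX) + gamma * (v @ features(s_next)) - v @ features(s)
    v += c_n * delta * features(s)
    # actor: two-sided SPSA gradient estimate over a random +/-1 perturbation
    Delta = rng.choice([-1.0, 1.0], size=theta.shape)
    eps = 0.1
    ghat = (rollout_cost(theta + eps * Delta, lam) -
            rollout_cost(theta - eps * Delta, lam)) / (2 * eps) * Delta
    theta -= b_n * ghat
    # multiplier: projected ascent on the constraint violation
    lam = max(0.0, lam + a_n * (g - C_MAX))
    s = s_next

print("theta:", theta, "lambda:", round(lam, 3))
```

Note the step-size ordering: the critic decays slowest (fastest timescale), the actor sits in between, and the multiplier decays fastest (slowest timescale), which is what makes the three coupled recursions behave as nested fixed-point iterations.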

Relevance:

30.00%

Publisher:

Abstract:

In this paper we study a decision feedback equalizer adapted via least mean squares (LMS-DFE). Using the ordinary differential equation (ODE) framework, we show that the LMS-DFE attractors are close to the true DFE Wiener filter (the one designed accounting for decision errors) at high SNR. LMS therefore gives a computationally efficient way to obtain the true DFE Wiener filter at high SNR. We also provide examples showing that the DFE filter so obtained can significantly outperform the usual DFE Wiener filter (designed assuming perfect decisions) at all practical SNRs. In fact, the performance improvement remains significant (up to 50%) even at high SNRs, where the popular Wiener filter designed with perfect decisions is believed to be close to optimal.
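
For readers unfamiliar with the setup, the following self-contained sketch shows a standard decision-directed LMS-DFE for BPSK over a toy ISI channel. The channel taps, tap counts, step size, and training length are illustrative assumptions; it demonstrates the kind of adaptation being analyzed, not the paper's specific analysis:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000
snr_db = 15
h = np.array([1.0, 0.5, 0.2])            # assumed ISI channel impulse response

bits = rng.integers(0, 2, N)
x = 2.0 * bits - 1.0                      # BPSK symbols
y = np.convolve(x, h)[:N]                 # channel output
y += np.sqrt(10 ** (-snr_db / 10)) * rng.standard_normal(N)

Nf, Nb = 5, 3                             # feedforward / feedback tap counts
f, b = np.zeros(Nf), np.zeros(Nb)
mu = 0.01                                 # LMS step size
past = np.zeros(Nb)                       # previously decided symbols
errors = 0

for n in range(Nf - 1, N):
    u = y[n - Nf + 1:n + 1][::-1]         # feedforward regressor
    z = f @ u - b @ past                  # equalizer output
    d = 1.0 if z >= 0 else -1.0           # hard decision
    ref = x[n] if n < 500 else d          # short training phase, then decision-directed
    e = ref - z                           # the error uses decisions, as in the true DFE
    f += mu * e * u                       # LMS tap updates
    b -= mu * e * past
    if n >= 500 and d != x[n]:
        errors += 1
    past = np.roll(past, 1)
    past[0] = ref                         # feed back the decided (or training) symbol

print("decision-directed symbol error rate:", errors / (N - 500))
```

Because the filter adapts on the decisions it actually makes, its stationary points account for decision errors; this is the sense in which the LMS attractors track the true DFE Wiener filter rather than the perfect-decisions design.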

Relevance:

30.00%

Publisher:

Abstract:

We develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework, in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. Our algorithm exhibits good performance in this setting and converges to a feasible point.
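
The long-run average-cost setting changes the critic recursion: the TD error uses the per-step cost minus a running average-cost estimate instead of a discount factor. The sketch below shows just that recursion on a toy Markov chain with a fixed Lagrange multiplier; the dynamics, features, and step size are assumptions, and the actor and multiplier updates are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(2)
v = np.zeros(2)      # differential-value weights (linear function approximation)
rho = 0.0            # running estimate of the long-run average Lagrangian cost
lam = 0.5            # Lagrange multiplier (held fixed in this sketch)
C_MAX = 1.0          # assumed bound on the long-run average constraint cost

def features(s):
    return np.array([s * s, 1.0])   # quadratic feature suits this toy cost

s = 0.0
for n in range(1, 20000):
    a_n = 1.0 / n ** 0.6                        # critic step size
    s_next = 0.9 * s + rng.standard_normal()    # stand-in Markov dynamics
    cost, g = s_next ** 2, abs(s_next)          # objective and constraint costs
    L = cost + lam * (g - C_MAX)                # per-step Lagrangian cost
    # average-cost TD(0): the TD error uses L - rho in place of discounting
    delta = L - rho + v @ features(s_next) - v @ features(s)
    v += a_n * delta * features(s)
    rho += a_n * (L - rho)                      # track the average cost
    s = s_next

print("estimated long-run average Lagrangian cost:", round(rho, 3))
```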

Relevance:

30.00%

Publisher:

Abstract:

This paper addresses the problem of finding optimal power control policies for wireless energy harvesting sensor (EHS) nodes with automatic repeat request (ARQ)-based packet transmissions. The EHS harvests energy from the environment according to a Bernoulli process and is required to operate within the constraint of energy neutrality. The EHS obtains partial channel state information (CSI) at the transmitter through the link-layer ARQ protocol, via the ACK/NACK feedback messages, and uses it to adapt the transmission power for the packet (re)transmission attempts. The underlying wireless fading channel is modeled as a finite-state Markov chain with known transition probabilities. The goal of the power management policy is thus to determine the best power setting for the current packet transmission attempt, so as to optimize a long-run expected reward, e.g., to minimize the expected outage probability. The problem is addressed in a decision-theoretic framework by casting it as a partially observable Markov decision process (POMDP). Because of the large state space, the exact solution to the POMDP is computationally expensive; hence, two popular approximate solutions are considered, which yield good power management policies for the transmission attempts. Monte Carlo simulation results illustrate the efficacy of the approach and show that the approximate solutions significantly outperform conventional approaches.
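
Two of the POMDP ingredients mentioned above, the Bayesian belief update over the finite-state Markov channel driven by ACK/NACK feedback and a power choice that respects the stored energy, can be sketched as follows. The two-state channel, success probabilities, power levels, and the certainty-equivalent heuristic are all illustrative assumptions; the paper's approximate POMDP solvers are not reproduced here:

```python
import numpy as np

P = np.array([[0.8, 0.2],            # assumed 2-state channel (0 = bad, 1 = good)
              [0.3, 0.7]])           # state transition probabilities
powers = np.array([0.0, 1.0, 2.0])   # assumed available transmit power levels

def p_success(state, power):
    """Assumed packet success probability per channel state and power."""
    base = (0.2, 0.7)[state]
    return min(0.95, base + 0.15 * power)

def belief_update(b, power, ack):
    """Bayes update of the channel belief from one ACK/NACK observation."""
    pred = P.T @ b                                   # predicted next-state belief
    like = np.array([p_success(s, power) if ack else 1.0 - p_success(s, power)
                     for s in range(len(pred))])     # observation likelihoods
    post = like * pred
    return post / post.sum()

def choose_power(b, battery):
    """Certainty-equivalent heuristic: never exceed the stored energy
    (energy neutrality) and spend more when the channel is likely bad."""
    affordable = powers[powers <= battery]
    if affordable.size == 0:
        return 0.0
    return affordable[-1] if b[0] > 0.5 else affordable[affordable.size // 2]

b = np.array([0.5, 0.5])                 # initial channel belief
battery = 1.0                            # currently stored energy (assumed units)
pw = choose_power(b, battery)
b = belief_update(b, pw, ack=False)      # a NACK shifts belief toward the bad state
print("power used:", pw, "belief after NACK:", np.round(b, 3))
```

The belief vector is the sufficient statistic for the POMDP, so any policy, exact or approximate, can be expressed as a map from (belief, battery level) to a transmit power.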