1000 results for decision systemm


Relevance:

20.00%

Publisher:

Abstract:

We develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance in this setting and converges to a feasible point.
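The Lagrangian relaxation mentioned in this abstract can be written out as a short sketch. The notation is assumed here, since the abstract defines no symbols: \theta is the policy parameter, J(\theta) the long-run average cost, G_i(\theta) the long-run average of the i-th constraint function, \alpha_i its bound, and \lambda_i \ge 0 the corresponding multiplier.

```latex
\min_{\theta} \; J(\theta)
\quad \text{subject to} \quad
G_i(\theta) \le \alpha_i, \qquad i = 1, \dots, N,

L(\theta, \lambda) \;=\; J(\theta) + \sum_{i=1}^{N} \lambda_i \bigl( G_i(\theta) - \alpha_i \bigr),
\qquad \lambda_i \ge 0 .
```

Under this reading, the constrained problem is replaced by a search for a saddle point of L: the policy parameter descends in \theta while each multiplier ascends in \lambda_i.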

Relevance:

20.00%

Publisher:

Abstract:

We introduce and study a class of non-stationary semi-Markov decision processes on a finite horizon. By constructing an equivalent Markov decision process, we establish the existence of a piecewise open loop relaxed control which is optimal for the finite horizon problem.

Relevance:

20.00%

Publisher:

Abstract:

Due to the inherent feedback in a decision feedback equalizer (DFE), the minimum mean square error (MMSE) or Wiener solution is not known exactly. The main difficulty in such analysis is the propagation of decision errors, which occurs because of the feedback. Thus, in the literature, these errors are neglected while designing and/or analyzing DFEs. A closed-form expression is then obtained for the Wiener solution, and we refer to this as the ideal DFE (IDFE). DFEs have also been designed using an iterative and computationally efficient alternative, the least mean square (LMS) algorithm. However, again due to the feedback involved, no analysis of an LMS-DFE is known so far. In this paper we theoretically analyze a DFE taking into account the decision errors. We study its performance at steady state. We then study an LMS-DFE and show the proximity of the LMS-DFE attractors to the optimal DFE Wiener filter (obtained after considering the decision errors) at high signal-to-noise ratios (SNRs). Further, via simulations we demonstrate that, even at moderate SNRs, an LMS-DFE is close to the MSE-optimal DFE. Finally, we compare the LMS-DFE attractors with the IDFE via simulations. We show that an LMS equalizer outperforms the IDFE. In fact, the performance improvement is very significant even at high SNRs (up to 33%), where an IDFE is believed to be closer to the optimal one. Towards the end, we briefly discuss the tracking properties of the LMS-DFE.
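To make the LMS-DFE structure concrete, here is a minimal sketch of a textbook LMS-adapted decision feedback equalizer for BPSK symbols. It is not the paper's analysis or its simulation setup: the function names, the two-tap channel, the step size `mu`, and the tap counts are all assumptions chosen for illustration. After the training phase the feedback filter is driven by the equalizer's own decisions, which is where decision errors can propagate.

```python
import random


def simulate_channel(symbols, taps, noise_std, rng):
    """Pass symbols through an FIR channel and add Gaussian noise (assumed model)."""
    out = []
    for k in range(len(symbols)):
        clean = sum(h * symbols[k - i] for i, h in enumerate(taps) if k - i >= 0)
        out.append(clean + rng.gauss(0.0, noise_std))
    return out


def lms_dfe(received, symbols, n_train, mu=0.02, n_ff=2, n_fb=1):
    """Adapt feedforward (ff) and feedback (fb) taps with the LMS rule.

    The first n_train symbols are treated as known training data; afterwards
    the equalizer runs decision-directed and feeds back its own decisions.
    """
    ff = [0.0] * n_ff
    fb = [0.0] * n_fb
    past_rx = [0.0] * n_ff    # most recent received sample first
    past_sym = [0.0] * n_fb   # most recent fed-back symbol first
    decisions = []
    for k, y in enumerate(received):
        past_rx = [y] + past_rx[:-1]
        # Equalizer output: feedforward part minus ISI estimated from past symbols.
        z = sum(w * x for w, x in zip(ff, past_rx)) - sum(
            b * d for b, d in zip(fb, past_sym)
        )
        decision = 1.0 if z >= 0.0 else -1.0
        reference = symbols[k] if k < n_train else decision
        err = reference - z
        # LMS: stochastic-gradient step on the squared error.
        ff = [w + mu * err * x for w, x in zip(ff, past_rx)]
        fb = [b - mu * err * d for b, d in zip(fb, past_sym)]
        past_sym = [reference] + past_sym[:-1]
        decisions.append(decision)
    return ff, fb, decisions
```

For a channel with taps `[1.0, 0.5]` and low noise, the taps in this sketch should settle near a leading feedforward tap of 1 and a feedback tap of 0.5, which cancels the trailing intersymbol interference.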

Relevance:

20.00%

Publisher:

Abstract:

We present a novel multi-timescale Q-learning algorithm for average cost control in a Markov decision process subject to multiple inequality constraints. We formulate a relaxed version of this problem through the Lagrange multiplier method. Our algorithm is different from Q-learning in that it updates two parameters - a Q-value parameter and a policy parameter. The Q-value parameter is updated on a slower time scale as compared to the policy parameter. Whereas Q-learning with function approximation can diverge in some cases, our algorithm is seen to be convergent as a result of the aforementioned timescale separation. We show the results of experiments on a problem of constrained routing in a multistage queueing network. Our algorithm is seen to exhibit good performance and the various inequality constraints are seen to be satisfied upon convergence of the algorithm.
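The timescale separation described in this abstract is usually expressed through step-size schedules. The sketch below shows one assumed pair: the abstract gives no schedules, so the exponents and function names are illustrative only. In two-timescale stochastic approximation, the recursion on the slower timescale uses step sizes that become negligible relative to the faster one, so their ratio tends to zero.

```python
def slow_step(n):
    """Assumed step size for the slower recursion (here, the Q-value parameter)."""
    return 1.0 / n


def fast_step(n):
    """Assumed step size for the faster recursion (here, the policy parameter)."""
    return 1.0 / n ** 0.6


def timescale_ratio(n):
    """Ratio slow/fast; it equals n**-0.4 and tends to zero as n grows."""
    return slow_step(n) / fast_step(n)
```

With these schedules the faster recursion sees the slower one as nearly frozen, while the slower recursion sees the faster one as nearly converged.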

Relevance:

20.00%

Publisher:

Abstract:

This paper considers antenna selection (AS) at a receiver equipped with multiple antenna elements but only a single radio frequency chain for packet reception. As information about the channel state is acquired using training symbols (pilots), the receiver makes its AS decisions based on noisy channel estimates. Additional information that can be exploited for AS includes the time-correlation of the wireless channel and the results of the link-layer error checks upon receiving the data packets. In this scenario, the task of the receiver is to sequentially select (a) the pilot symbol allocation, i.e., how to distribute the available pilot symbols among the antenna elements, for channel estimation on each of the receive antennas; and (b) the antenna to be used for data packet reception. The goal is to maximize the expected throughput, based on the past history of allocation and selection decisions, and the corresponding noisy channel estimates and error check results. Since the channel state is only partially observed through the noisy pilots and the error checks, the joint problem of pilot allocation and AS is modeled as a partially observed Markov decision process (POMDP). The solution to the POMDP yields the policy that maximizes the long-term expected throughput. Using the Finite State Markov Chain (FSMC) model for the wireless channel, the performance of the POMDP solution is compared with that of other existing schemes, and it is illustrated through numerical evaluation that the POMDP solution significantly outperforms them.
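Because the channel state is only partially observed, a POMDP policy acts on a belief, that is, a probability distribution over the hidden channel states. The following is a minimal sketch of the standard Bayes belief update for a finite-state Markov chain. The function name, the two-state example, and the predict-then-correct ordering (the observation is assumed to depend on the state after the transition) are assumptions, not details taken from the paper.

```python
def belief_update(belief, transition, likelihood):
    """One Bayes-filter step over a finite-state Markov chain.

    belief[i]        : prior probability of state i
    transition[i][j] : probability of moving from state i to state j
    likelihood[j]    : probability of the received observation given state j
                       (e.g. a noisy pilot estimate or an error-check result)
    """
    n = len(belief)
    # Prediction: propagate the belief through the channel's transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]
```

For example, with a uniform prior over two states, `transition = [[0.9, 0.1], [0.2, 0.8]]` and `likelihood = [0.1, 0.9]`, the update shifts most of the probability mass to the second state.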

Relevance:

20.00%

Publisher:

Abstract:

This paper addresses the problem of finding optimal power control policies for wireless energy harvesting sensor (EHS) nodes with automatic repeat request (ARQ)-based packet transmissions. The EHS harvests energy from the environment according to a Bernoulli process; and it is required to operate within the constraint of energy neutrality. The EHS obtains partial channel state information (CSI) at the transmitter through the link-layer ARQ protocol, via the ACK/NACK feedback messages, and uses it to adapt the transmission power for the packet (re)transmission attempts. The underlying wireless fading channel is modeled as a finite state Markov chain with known transition probabilities. Thus, the goal of the power management policy is to determine the best power setting for the current packet transmission attempt, so as to maximize a long-run expected reward such as the expected outage probability. The problem is addressed in a decision-theoretic framework by casting it as a partially observable Markov decision process (POMDP). Due to the large size of the state-space, the exact solution to the POMDP is computationally expensive. Hence, two popular approximate solutions are considered, which yield good power management policies for the transmission attempts. Monte Carlo simulation results illustrate the efficacy of the approach and show that the approximate solutions significantly outperform conventional approaches.

Relevance:

20.00%

Publisher:

Abstract:

H.264/Advanced Video Coding (AVC) surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating point computations. The storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low computational cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in the literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.

Relevance:

20.00%

Publisher:

Abstract:

Wildlife conservation in human-dominated landscapes requires that we understand how animals, when making habitat-use decisions, obtain diverse and dynamically occurring resources while avoiding risks, induced by both natural predators and anthropogenic threats. Little is known about the underlying processes that enable wild animals to persist in densely populated human-dominated landscapes, particularly in developing countries. In a complex, semi-arid, fragmented, human-dominated agricultural landscape, we analyzed the habitat-use of blackbuck, a large herbivore endemic to the Indian sub-continent. We hypothesized that blackbuck would show flexible habitat-use behaviour and be risk averse when resource quality in the landscape is high, and less sensitive to risk otherwise. Overall, blackbuck appeared to be strongly influenced by human activity and they offset risks by using small protected patches (~3 km²) when they could afford to do so. Blackbuck habitat use varied dynamically corresponding with seasonally-changing levels of resources and risks, with protected habitats registering maximum use. The findings show that human activities can strongly influence and perhaps limit ungulate habitat-use and behaviour, but spatial heterogeneity in risk, particularly the presence of refuges, can allow ungulates to persist in landscapes with high human and livestock densities.

Relevance:

20.00%

Publisher:

Abstract:

This paper describes an approach to structuring the make or buy decision process, basing it firmly in the context of an overall manufacturing strategy. The work has been carried out jointly by the University of Cambridge Manufacturing Engineering Group and Lucas Industries. A review of the current state of ideas surrounding the linked issues of vertical integration and make or buy decisions is presented. Important features of the approach include identification of core manufacturing capabilities, assessment of the role of technology in manufacturing, the development of a cost model to support make or buy decisions and a review of the strategic implications of varying degrees of vertical integration.

Relevance:

20.00%

Publisher:

Abstract:

While marketing is associated with negative practices involving exploitation and dishonesty, Anton Jamnik affirms the need to create an ethical theory for it. The article attempts, on the one hand, to give a brief outline of the main currents in the marketing ethics literature and, on the other, to contribute to its development. The author analyzes the ethical challenges that will arise in the future from three distinct sources: technological innovations, the influence of global competition, and the expansion of market activities into non-traditional areas. This will require the development of a realistic normative ethics. In conclusion, he explains that marketing ethics should examine how successful it has been in resolving the ethical challenges of today's world.

Relevance:

20.00%

Publisher:

Abstract:

This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.