Digital Library

27 results for buying decision process

Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks

Relevance: 80.00%

Abstract:

In this paper, we consider an intrusion detection application for Wireless Sensor Networks. We study the problem of scheduling the sleep times of the individual sensors, where the objective is to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to Fuemmeler and Veeravalli (IEEE Trans Signal Process 56(5), 2091-2101, 2008). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation. Feature-based representations and function approximation are necessary to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation architecture for the Q-values) is updated in an on-policy temporal difference algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model; this is useful in settings where the latter is not known.
Our simulation results on a synthetic 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.
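
The one-simulation simultaneous perturbation estimate mentioned in the abstract can be illustrated with a generic sketch. This is not the authors' algorithm: the function names, the linear toy cost, and the step size `delta` are assumptions chosen for illustration, and the perturbation is taken to be independent +1/-1 components.

```python
import random

def one_sim_spsa_gradient(cost, theta, delta, perturbation):
    """Estimate the gradient of `cost` at `theta` from a single perturbed evaluation."""
    perturbed = [t + delta * d for t, d in zip(theta, perturbation)]
    value = cost(perturbed)  # the only cost evaluation (one "simulation")
    # Each component divides the same cost value by its own perturbation.
    return [value / (delta * d) for d in perturbation]

def random_signs(n, rng):
    """Draw independent +1/-1 perturbation components (an assumed choice)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

# Hypothetical toy cost, used only to exercise the estimator.
rng = random.Random(0)
theta = [1.0, 1.0]
grad = one_sim_spsa_gradient(lambda x: x[0] + 2.0 * x[1], theta, 0.5, random_signs(2, rng))
```

A single estimate of this kind is very noisy; it only becomes useful when averaged implicitly over many iterations of a stochastic approximation recursion with diminishing step sizes.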

Adaptive Sleep-Wake Control using Reinforcement Learning in Sensor Networks

Relevance: 80.00%

Abstract:

The aim of this paper is to allocate the "sleep time" of the individual sensors in an intrusion detection application so that the energy consumption of the sensors is reduced, while keeping the tracking error to a minimum. We propose two novel reinforcement learning (RL) based algorithms that attempt to minimize a certain long-run average cost objective. Both our algorithms incorporate feature-based representations to handle the curse of dimensionality associated with the underlying partially-observable Markov decision process (POMDP). Further, the feature selection scheme used in our algorithms intelligently manages the energy cost and tracking cost factors, which in turn assists the search for the optimal sleeping policy. We also extend these algorithms to a setting where the intruder's mobility model is not known by incorporating a stochastic iterative scheme for estimating the mobility model. The simulation results on a synthetic 2-d network setting are encouraging.

Relay Selection with Channel Probing in Sleep-Wake Cycling Wireless Sensor Networks

Relevance: 80.00%

Abstract:

In geographical forwarding of packets in a large wireless sensor network (WSN) with sleep-wake cycling nodes, we are interested in the local decision problem faced by a node that has "custody" of a packet and has to choose one among a set of next-hop relay nodes to forward the packet toward the sink. Each relay is associated with a "reward" that summarizes the benefit of forwarding the packet through that relay. We seek a solution to this local problem, the idea being that such a solution, if adopted by every node, could provide a reasonable heuristic for the end-to-end forwarding problem. Toward this end, we propose a local relay selection problem consisting of a forwarding node and a collection of relay nodes, with the relays waking up sequentially at random times. At each relay wake-up instant, the forwarder can choose to probe a relay to learn its reward value, based on which the forwarder can then decide whether to stop (and forward its packet to the chosen relay) or to continue to wait for further relays to wake up. The forwarder's objective is to select a relay so as to minimize a combination of waiting delay, reward, and probing cost. The local decision problem can be considered as a variant of the asset selling problem studied in the operations research literature. We formulate the local problem as a Markov decision process (MDP) and characterize the solution in terms of stopping sets and probing sets. We provide results illustrating the structure of the stopping sets, namely, the (lower bound) threshold and the stage independence properties. Regarding the probing sets, we make an interesting conjecture that these sets are characterized by upper bounds. Through simulation experiments, we provide valuable insights into the performance of the optimal local forwarding and its use as an end-to-end forwarding heuristic.

Multi-agent Reinforcement Learning for Traffic Signal Control

Relevance: 80.00%

Abstract:

Optimal control of traffic lights at junctions or traffic signal control (TSC) is essential for reducing the average delay experienced by the road users amidst the rapid increase in the usage of vehicles. In this paper, we formulate the TSC problem as a discounted cost Markov decision process (MDP) and apply multi-agent reinforcement learning (MARL) algorithms to obtain dynamic TSC policies. We model each traffic signal junction as an independent agent. An agent decides the signal duration of its phases in a round-robin (RR) manner using multi-agent Q-learning with either ε-greedy or UCB [3] based exploration strategies. It updates its Q-factors based on the cost feedback signal received from its neighbouring agents. This feedback signal can be easily constructed and is shown to be effective in minimizing the average delay of the vehicles in the network. We show through simulations over VISSIM that our algorithms perform significantly better than both the standard fixed signal timing (FST) algorithm and the saturation balancing (SAT) algorithm [15] over two real road networks.
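
As a rough illustration of Q-learning with ε-greedy exploration in a discounted-cost setting, here is a minimal single-agent, tabular sketch. It is not the paper's multi-agent scheme: the table layout, the step size `alpha`, the discount `gamma`, and the toy cost are all assumptions, and the neighbour-based cost feedback described in the abstract is reduced to a plain scalar `cost`.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise pick the lowest-cost action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return min(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q, state, action, cost, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for a discounted-cost MDP (costs are minimized)."""
    target = cost + gamma * min(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]

# Toy table: 2 states, 2 actions, all Q-factors start at zero.
q = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)
action = epsilon_greedy(q[0], epsilon=0.1, rng=rng)
q_update(q, state=0, action=action, cost=1.0, next_state=1)
```

Because the objective is a cost, both the greedy choice and the bootstrap target take a minimum over actions where a reward formulation would take a maximum.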

Training-Based Antenna Selection for PER Minimization: A POMDP Approach

Relevance: 80.00%

Abstract:

This paper considers the problem of receive antenna selection (AS) in a multiple-antenna communication system having a single radio-frequency (RF) chain. The AS decisions are based on noisy channel estimates obtained using known pilot symbols embedded in the data packets. The goal here is to minimize the average packet error rate (PER) by exploiting the known temporal correlation of the channel. As the underlying channels are only partially observed using the pilot symbols, the problem of AS for PER minimization is cast into a partially observable Markov decision process (POMDP) framework. Under mild assumptions, the optimality of a myopic policy is established for the two-state channel case. Moreover, two heuristic AS schemes are proposed based on a weighted combination of the estimated channel states on the different antennas. These schemes utilize the continuous valued received pilot symbols to make the AS decisions, and are shown to offer performance comparable to the POMDP approach, which requires one to quantize the channel and observations to a finite set of states. The performance improvement offered by the POMDP solution and the proposed heuristic solutions relative to existing AS training-based approaches is illustrated using Monte Carlo simulations.

As-You-Go Deployment of a 2-Connected Wireless Relay Network for Sensor-Sink Interconnection

Relevance: 80.00%

Abstract:

A person walks along a line (which could be an idealisation of a forest trail, for example), placing relays as he walks, in order to create a multihop network for connecting a sensor at a point along the line to a sink at the start of the line. The potential placement points are equally spaced along the line, and at each such location the decision to place or not to place a relay is based on link quality measurements to the previously placed relays. The location of the sensor is unknown a priori, and is discovered as the deployment agent walks. In this paper, we extend our earlier work on this class of problems to include the objective of achieving a 2-connected multihop network. We propose a network cost objective that is additive over the deployed relays, and accounts for possible alternate routing over the multiple available paths. As in our earlier work, the problem is formulated as a Markov decision process. Placement algorithms are obtained for two source location models, which yield a discounted cost MDP and an average cost MDP. In each case we obtain structural results for an optimal policy, and perform a numerical study that provides insights into the advantages and disadvantages of multi-connectivity. We validate the results obtained from the numerical study experimentally in a forest-like environment.

Validation of Decision Tables Used in Process Control

Relevance: 40.00%

Abstract:

Process control rules may be specified using decision tables. Such a specification is superior when logical decisions to be taken in control dominate. In this paper we give a method of detecting redundancies, incompleteness, and contradictions in such specifications. Using such a technique thus ensures the validity of the specifications.
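
The three defects named in the abstract can be made concrete with a small brute-force sketch. This is not the paper's method: it simply enumerates every combination of Boolean conditions, and the rule encoding (a tuple of `True`/`False`/`None`, where `None` means "don't care") and the action names are assumptions made for illustration.

```python
from itertools import product

def matches(rule_conditions, combo):
    """A rule matches when every non-wildcard condition equals the input value."""
    return all(c is None or c == v for c, v in zip(rule_conditions, combo))

def validate(rules, n_conditions):
    """Return (uncovered inputs, redundant rule pairs, contradictory rule pairs)."""
    missing, redundant, contradictory = [], set(), set()
    for combo in product([False, True], repeat=n_conditions):
        hits = [i for i, (conds, _) in enumerate(rules) if matches(conds, combo)]
        if not hits:
            missing.append(combo)  # incompleteness: no rule fires
        for x in range(len(hits)):
            for y in range(x + 1, len(hits)):
                pair = (hits[x], hits[y])
                if rules[pair[0]][1] == rules[pair[1]][1]:
                    redundant.add(pair)      # overlap, same action
                else:
                    contradictory.add(pair)  # overlap, different actions
    return missing, sorted(redundant), sorted(contradictory)

# Hypothetical two-condition table with one defect of each kind.
rules = [((True, None), "open"), ((True, True), "open"), ((True, False), "close")]
missing, redundant, contradictory = validate(rules, 2)
```

Enumeration is exponential in the number of conditions, so it only suits small tables.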

Information and Decision Technologies

Relevance: 30.00%

Abstract:

Production scheduling in a flexible manufacturing system (FMS) is a real-time combinatorial optimization problem that has been proved to be NP-complete. Solving this problem needs on-line monitoring of plan execution and requires real-time decision-making in selecting alternative routings, assigning required resources, and rescheduling when failures occur in the system. Expert systems provide a natural framework for solving this kind of NP-complete problem. In this paper an expert system with a novel parallel heuristic approach is implemented for automatic short-term dynamic scheduling of FMS. The principal features of the expert system presented in this paper include easy rescheduling, on-line plan execution, load balancing, an on-line garbage collection process, and the use of advanced knowledge representational schemes. Its effectiveness is demonstrated with two examples.

Video coding mode decision as a classification problem

Relevance: 30.00%

Abstract:

In this paper, we show that it is possible to reduce the complexity of Intra MB coding in H.264/AVC based on a novel chance constrained classifier. Using pairs of simple mean-variance values, our technique is able to reduce the complexity of the Intra MB coding process with a negligible loss in PSNR. We present an alternate approach to address the classification problem which is equivalent to machine learning. Implementation results show that the proposed method reduces encoding time to about 20% of the reference implementation with an average loss of 0.05 dB in PSNR.

Analysis of an LMS linear equalizer for fading channels in decision directed mode

Relevance: 30.00%

Abstract:

We consider a time varying wireless fading channel, equalized by an LMS linear equalizer in decision directed mode (DD-LMS-LE). We study how well this equalizer tracks the optimal Wiener equalizer. Initially we study a fixed channel. For a fixed channel, we obtain the existence of DD attractors near the Wiener filter at high SNRs using an ODE (Ordinary Differential Equation) approximating the DD-LMS-LE. We also show, via examples, that the DD attractors may not be close to the Wiener filters at low SNRs. Next we study a time varying fading channel modeled by an Auto-regressive (AR) process of order 2. The DD-LMS equalizer and the AR process are jointly approximated by the solution of a system of ODEs. We show via examples that the LMS equalizer ODE tracks the ODE corresponding to the instantaneous Wiener filter when the SNR is high. This may not happen at low SNRs.
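
For readers unfamiliar with decision-directed operation, the following sketch shows one LMS update in which the equalizer's own hard decision stands in for the unknown transmitted symbol. It is a simplified illustration, not the paper's model: it assumes real-valued BPSK symbols (+1/-1), and the step size `mu` and the function name are hypothetical.

```python
def dd_lms_step(weights, window, mu=0.01):
    """One decision-directed LMS update for a real-valued linear equalizer (BPSK assumed)."""
    # Equalizer output: inner product of the taps with the received-sample window.
    output = sum(w * x for w, x in zip(weights, window))
    # Hard decision replaces the training symbol in decision-directed mode.
    decision = 1.0 if output >= 0 else -1.0
    error = decision - output
    # Standard LMS correction along the received samples.
    new_weights = [w + mu * error * x for w, x in zip(weights, window)]
    return new_weights, decision, error

# Toy call with two taps and a hypothetical received window.
weights, decision, error = dd_lms_step([0.5, 0.0], [1.0, -1.0], mu=0.1)
```

When the decisions are mostly correct, as at high SNR, this behaves like trained LMS; wrong decisions push the taps in the wrong direction, which is consistent with the abstract's caution about low SNRs.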

Tracking analysis of an LMS decision feedback equalizer for a wireless channel

Relevance: 30.00%

Abstract:

We consider a time varying wireless fading channel, equalized by an LMS Decision Feedback equalizer (DFE). We study how well this equalizer tracks the optimal MMSE-DFE (Wiener) equalizer. We model the channel by an Autoregressive (AR) process. Then the LMS equalizer and the AR process are jointly approximated by the solution of a system of ODEs (ordinary differential equations). Using these ODEs, we show via some examples that the LMS equalizer moves close to the instantaneous Wiener filter after initial transience. We also compare the LMS equalizer with the instantaneous optimal DFE (the commonly used Wiener filter) designed assuming perfect previous decisions and computed using a perfect channel estimate (we call it the IDFE). We show that the LMS equalizer outperforms the IDFE almost all the time after initial transience.

A unification-based decision procedure for cryptographic protocol analysis

Relevance: 30.00%

Abstract:

We present a sound and complete decision procedure for the bounded process cryptographic protocol insecurity problem, based on the notion of normal proofs [2] and classical unification. We also show a result about the existence of attacks with “high” normal cuts. Our proof of correctness provides an alternate proof and new insights into the fundamental result of Rusinowitch and Turuani [9] for the same setting.
