145 resultados para Asymptotically optimal policy
Resumo:
The standard quantum search algorithm lacks a feature, enjoyed by many classical algorithms, of having a fixed-point, i.e. a monotonic convergence towards the solution. Here we present two variations of the quantum search algorithm, which get around this limitation. The first replaces selective inversions in the algorithm by selective phase shifts of $\frac{\pi}{3}$. The second controls the selective inversion operations using two ancilla qubits, and irreversible measurement operations on the ancilla qubits drive the starting state towards the target state. Using $q$ oracle queries, these variations reduce the probability of finding a non-target state from $\epsilon$ to $\epsilon^{2q+1}$, which is asymptotically optimal. Similar ideas can lead to robust quantum algorithms, and provide conceptually new schemes for error correction.
Resumo:
Our work is motivated by geographical forwarding of sporadic alarm packets to a base station in a wireless sensor network (WSN), where the nodes are sleep-wake cycling periodically and asynchronously. We seek to develop local forwarding algorithms that can be tuned so as to tradeoff the end-to-end delay against a total cost, such as the hop count or total energy. Our approach is to solve, at each forwarding node enroute to the sink, the local forwarding problem of minimizing one-hop waiting delay subject to a lower bound constraint on a suitable reward offered by the next-hop relay; the constraint serves to tune the tradeoff. The reward metric used for the local problem is based on the end-to-end total cost objective (for instance, when the total cost is hop count, we choose to use the progress toward sink made by a relay as the reward). The forwarding node, to begin with, is uncertain about the number of relays, their wake-up times, and the reward values, but knows the probability distributions of these quantities. At each relay wake-up instant, when a relay reveals its reward value, the forwarding node's problem is to forward the packet or to wait for further relays to wake-up. In terms of the operations research literature, our work can be considered as a variant of the asset selling problem. We formulate our local forwarding problem as a partially observable Markov decision process (POMDP) and obtain inner and outer bounds for the optimal policy. Motivated by the computational complexity involved in the policies derived out of these bounds, we formulate an alternate simplified model, the optimal policy for which is a simple threshold rule. We provide simulation results to compare the performance of the inner and outer bound policies against the simple policy, and also against the optimal policy when the source knows the exact number of relays. Observing the good performance and the ease of implementation of the simple policy, we apply it to our motivating problem, i.e., local geographical routing of sporadic alarm packets in a large WSN. We compare the end-to-end performance (i.e., average total delay and average total cost) obtained by the simple policy, when used for local geographical forwarding, against that obtained by the globally optimal forwarding algorithm proposed by Kim et al. 1].
Resumo:
The assignment of tasks to multiple resources becomes an interesting game theoretic problem, when both the task owner and the resources are strategic. In the classical, nonstrategic setting, where the states of the tasks and resources are observable by the controller, this problem is that of finding an optimal policy for a Markov decision process (MDP). When the states are held by strategic agents, the problem of an efficient task allocation extends beyond that of solving an MDP and becomes that of designing a mechanism. Motivated by this fact, we propose a general mechanism which decides on an allocation rule for the tasks and resources and a payment rule to incentivize agents' participation and truthful reports. In contrast to related dynamic strategic control problems studied in recent literature, the problem studied here has interdependent values: the benefit of an allocation to the task owner is not simply a function of the characteristics of the task itself and the allocation, but also of the state of the resources. We introduce a dynamic extension of Mezzetti's two phase mechanism for interdependent valuations. In this changed setting, the proposed dynamic mechanism is efficient, within period ex-post incentive compatible, and within period ex-post individually rational.
Resumo:
Channel-aware assignment of sub-channels to users in the downlink of an OFDMA system demands extensive feedback of channel state information (CSI) to the base station. Since the feedback bandwidth is often very scarce, schemes that limit feedback are necessary. We develop a novel, low feedback splitting-based algorithm for assigning each sub-channel to its best user, i.e., the user with the highest gain for that sub-channel among all users. The key idea behind the algorithm is that, at any time, each user contends for the sub-channel on which it has the largest channel gain among the unallocated sub-channels. Unlike other existing schemes, the algorithm explicitly handles multiple access control aspects associated with the feedback of CSI. A tractable asymptotic analysis of a system with a large number of users helps design the algorithm. It yields 50% to 65% throughput gains compared to an asymptotically optimal one-bit feedback scheme, when the number of users is as small as 10 or as large as 1000. The algorithm is fast and distributed, and scales with the number of users.
Resumo:
We propose energy harvesting technologies and cooperative relaying techniques to power the devices and improve reliability. We propose schemes to (a) maximize the packet reception ratio (PRR) by cooperation and (b) minimize the average packet delay (APD) by cooperation amongst nodes. Our key result and insight from the testbed implementation is about total data transmitted by each relay. A greedy policy that relays more data under a good harvesting condition turns out to be a sub optimal policy. This is because, energy replenishment is a slow process. The optimal scheme offers a low APD and also improves PRR.
Resumo:
This paper addresses the problem of finding outage-optimal power control policies for wireless energy harvesting sensor (EHS) nodes with automatic repeat request (ARQ)-based packet transmissions. The power control policy of the EHS specifies the transmission power for each packet transmission attempt, based on all the information available at the EHS. In particular, the acknowledgement (ACK) or negative acknowledgement (NACK) messages received provide the EHS with partial information about the channel state. We solve the problem of finding an optimal power control policy by casting it as a partially observable Markov decision process (POMDP). We study the structure of the optimal power policy in two ways. First, for the special case of binary power levels at the EHS, we show that the optimal policy for the underlying Markov decision process (MDP) when the channel state is observable is a threshold policy in the battery state. Second, we benchmark the performance of the EHS by rigorously analyzing the outage probability of a general fixed-power transmission scheme, where the EHS uses a predetermined power level at each slot within the frame. Monte Carlo simulation results illustrate the performance of the POMDP approach and verify the accuracy of the analysis. They also show that the POMDP solutions can significantly outperform conventional ad hoc approaches.
Resumo:
We consider a scheduler for the downlink of a wireless channel when only partial channel-state information is available at the scheduler. We characterize the network stability region and provide two throughput-optimal scheduling policies. We also derive a deterministic bound on the mean packet delay in the network. Finally, we provide a throughput-optimal policy for the network under QoS constraints when real-time and rate-guaranteed data traffic may be present.
Resumo:
We study the problem of optimal sequential (''as-you-go'') deployment of wireless relay nodes, as a person walks along a line of random length (with a known distribution). The objective is to create an impromptu multihop wireless network for connecting a packet source to be placed at the end of the line with a sink node located at the starting point, to operate in the light traffic regime. In walking from the sink towards the source, at every step, measurements yield the transmit powers required to establish links to one or more previously placed nodes. Based on these measurements, at every step, a decision is made to place a relay node, the overall system objective being to minimize a linear combination of the expected sum power (or the expected maximum power) required to deliver a packet from the source to the sink node and the expected number of relay nodes deployed. For each of these two objectives, two different relay selection strategies are considered: (i) each relay communicates with the sink via its immediate previous relay, (ii) the communication path can skip some of the deployed relays. With appropriate modeling assumptions, we formulate each of these problems as a Markov decision process (MDP). We provide the optimal policy structures for all these cases, and provide illustrations of the policies and their performance, via numerical results, for some typical parameters.
Resumo:
A new class of exact-repair regenerating codes is constructed by stitching together shorter erasure correction codes, where the stitching pattern can be viewed as block designs. The proposed codes have the help-by-transfer property where the helper nodes simply transfer part of the stored data directly, without performing any computation. This embedded error correction structure makes the decoding process straightforward, and in some cases the complexity is very low. We show that this construction is able to achieve performance better than space-sharing between the minimum storage regenerating codes and the minimum repair-bandwidth regenerating codes, and it is the first class of codes to achieve this performance. In fact, it is shown that the proposed construction can achieve a nontrivial point on the optimal functional-repair tradeoff, and it is asymptotically optimal at high rate, i.e., it asymptotically approaches the minimum storage and the minimum repair-bandwidth simultaneously.
Resumo:
A person walks along a line (which could be an idealisation of a forest trail, for example), placing relays as he walks, in order to create a multihop network for connecting a sensor at a point along the line to a sink at the start of the line. The potential placement points are equally spaced along the line, and at each such location the decision to place or not to place a relay is based on link quality measurements to the previously placed relays. The location of the sensor is unknown apriori, and is discovered as the deployment agent walks. In this paper, we extend our earlier work on this class of problems to include the objective of achieving a 2-connected multihop network. We propose a network cost objective that is additive over the deployed relays, and accounts for possible alternate routing over the multiple available paths. As in our earlier work, the problem is formulated as a Markov decision process. Placement algorithms are obtained for two source location models, which yield a discounted cost MDP and an average cost MDP. In each case we obtain structural results for an optimal policy, and perform a numerical study that provides insights into the advantages and disadvantages of multi-connectivity. We validate the results obtained from numerical study experimentally in a forest-like environment.
Resumo:
The stochastic version of Pontryagin's maximum principle is applied to determine an optimal maintenance policy of equipment subject to random deterioration. The deterioration of the equipment with age is modelled as a random process. Next the model is generalized to include random catastrophic failure of the equipment. The optimal maintenance policy is derived for two special probability distributions of time to failure of the equipment, namely, exponential and Weibull distributions Both the salvage value and deterioration rate of the equipment are treated as state variables and the maintenance as a control variable. The result is illustrated by an example
Resumo:
We propose a simulation-based algorithm for computing the optimal pricing policy for a product under uncertain demand dynamics. We consider a parameterized stochastic differential equation (SDE) model for the uncertain demand dynamics of the product over the planning horizon. In particular, we consider a dynamic model that is an extension of the Bass model. The performance of our algorithm is compared to that of a myopic pricing policy and is shown to give better results. Two significant advantages with our algorithm are as follows: (a) it does not require information on the system model parameters if the SDE system state is known via either a simulation device or real data, and (b) as it works efficiently even for high-dimensional parameters, it uses the efficient smoothed functional gradient estimator.
Resumo:
Pricing is an effective tool to control congestion and achieve quality of service (QoS) provisioning for multiple differentiated levels of service. In this paper, we consider the problem of pricing for congestion control in the case of a network of nodes under a single service class and multiple queues, and present a multi-layered pricing scheme. We propose an algorithm for finding the optimal state dependent price levels for individual queues, at each node. The pricing policy used depends on a weighted average queue length at each node. This helps in reducing frequent price variations and is in the spirit of the random early detection (RED) mechanism used in TCP/IP networks. We observe in our numerical results a considerable improvement in performance using our scheme over that of a recently proposed related scheme in terms of both throughput and delay performance. In particular, our approach exhibits a throughput improvement in the range of 34 to 69 percent in all cases studied (over all routes) over the above scheme.
Resumo:
We consider an optimal power and rate scheduling problem for a multiaccess fading wireless channel with the objective of minimising a weighted sum of mean packet transmission delay subject to a peak power constraint. The base station acts as a controller which, depending upon the buffer lengths and the channel state of each user, allocates transmission rate and power to individual users. We assume perfect channel state information at the transmitter and the receiver. We also assume a Markov model for the fading and packet arrival processes. The policy obtained represents a form of Indexability.
Resumo:
The problem of admission control of packets in communication networks is studied in the continuous time queueing framework under different classes of service and delayed information feedback. We develop and use a variant of a simulation based two timescale simultaneous perturbation stochastic approximation (SPSA) algorithm for finding an optimal feedback policy within the class of threshold type policies. Even though SPSA has originally been designed for continuous parameter optimization, its variant for the discrete parameter case is seen to work well. We give a proof of the hypothesis needed to show convergence of the algorithm on our setting along with a sketch of the convergence analysis. Extensive numerical experiments with the algorithm are illustrated for different parameter specifications. In particular, we study the effect of feedback delays on the system performance.