145 resultados para Asymptotically optimal policy
Resumo:
In this paper we are concerned with finding the maximum throughput that a mobile ad hoc network can support. Even when nodes are stationary, the problem of determining the capacity region has long been known to be NP-hard. Mobility introduces an additional dimension of complexity because nodes now also have to decide when they should initiate route discovery. Since route discovery involves communication and computation overhead, it should not be invoked very often. On the other hand, mobility implies that routes are bound to become stale resulting in sub-optimal performance if routes are not updated. We attempt to gain some understanding of these effects by considering a simple one-dimensional network model. The simplicity of our model allows us to use stochastic dynamic programming (SDP) to find the maximum possible network throughput with ideal routing and medium access control (MAC) scheduling. Using the optimal value as a benchmark, we also propose and evaluate the performance of a simple threshold-based heuristic. Unlike the optimal policy which requires considerable state information, the heuristic is very simple to implement and is not overly sensitive to the threshold value used. We find empirical conditions for our heuristic to be near-optimal as well as network scenarios when our simple heuristic does not perform very well. We provide extensive numerical and simulation results for different parameter settings of our model.
Resumo:
We study the problem of optimal bandwidth allocation in communication networks. We consider a queueing model with two queues to which traffic from different competing flows arrive. The queue length at the buffers is observed every T instants of time, on the basis of which a decision on the amount of bandwidth to be allocated to each buffer for the next T instants is made. We consider a class of closed-loop feedback policies for the system and use a twotimescale simultaneous perturbation stochastic approximation(SPSA) algorithm to find an optimal policy within the prescribed class. We study the performance of the proposed algorithm on a numerical setting. Our algorithm is found to exhibit good performance.
Resumo:
We consider a visual search problem studied by Sripati and Olson where the objective is to identify an oddball image embedded among multiple distractor images as quickly as possible. We model this visual search task as an active sequential hypothesis testing problem (ASHT problem). Chernoff in 1959 proposed a policy in which the expected delay to decision is asymptotically optimal. The asymptotics is under vanishing error probabilities. We first prove a stronger property on the moments of the delay until a decision, under the same asymptotics. Applying the result to the visual search problem, we then propose a ``neuronal metric'' on the measured neuronal responses that captures the discriminability between images. From empirical study we obtain a remarkable correlation (r = 0.90) between the proposed neuronal metric and speed of discrimination between the images. Although this correlation is lower than with the L-1 metric used by Sripati and Olson, this metric has the advantage of being firmly grounded in formal decision theory.
Resumo:
Amplify-and-forward (AF) relay based cooperation has been investigated in the literature given its simplicity and practicality. Two models for AF, namely, fixed gain and fixed power relaying, have been extensively studied. In fixed gain relaying, the relay gain is fixed but its transmit power varies as a function of the source-relay (SR) channel gain. In fixed power relaying, the relay's instantaneous transmit power is fixed, but its gain varies. We propose a general AF cooperation model in which an average transmit power constrained relay jointly adapts its gain and transmit power as a function of the channel gains. We derive the optimal AF gain policy that minimizes the fading- averaged symbol error probability (SEP) of MPSK and present insightful and tractable lower and upper bounds for it. We then analyze the SEP of the optimal policy. Our results show that the optimal scheme is up to 39.7% and 47.5% more energy-efficient than fixed power relaying and fixed gain relaying, respectively. Further, the weaker the direct source-destination link, the greater are the energy-efficiency gains.
Resumo:
We consider a power optimization problem with average delay constraint on the downlink of a Green Base-station. A Green Base-station is powered by both renewable energy such as solar or wind energy as well as conventional sources like diesel generators or the power grid. We try to minimize the energy drawn from conventional energy sources and utilize the harvested energy to the maximum extent. Each user also has an average delay constraint for its data. The optimal action consists of scheduling the users and allocating the optimal transmission rate for the chosen user. In this paper, we formulate the problem as a Markov Decision Problem and show the existence of a stationary average-cost optimal policy. We also derive some structural results for the optimal policy.
Resumo:
Transmit antenna selection (AS) is a popular, low hardware complexity technique that improves the performance of an underlay cognitive radio system, in which a secondary transmitter can transmit when the primary is on but under tight constraints on the interference it causes to the primary. The underlay interference constraint fundamentally changes the criterion used to select the antenna because the channel gains to the secondary and primary receivers must be both taken into account. We develop a novel and optimal joint AS and transmit power adaptation policy that minimizes a Chernoff upper bound on the symbol error probability (SEP) at the secondary receiver subject to an average transmit power constraint and an average primary interference constraint. Explicit expressions for the optimal antenna and power are provided in terms of the channel gains to the primary and secondary receivers. The SEP of the optimal policy is at least an order of magnitude lower than that achieved by several ad hoc selection rules proposed in the literature and even the optimal antenna selection rule for the case where the transmit power is either zero or a fixed value.
Resumo:
Our work is motivated by impromptu (or ``as-you-go'') deployment of wireless relay nodes along a path, a need that arises in many situations. In this paper, the path is modeled as starting at the origin (where there is the data sink, e.g., the control center), and evolving randomly over a lattice in the positive quadrant. A person walks along the path deploying relay nodes as he goes. At each step, the path can, randomly, either continue in the same direction or take a turn, or come to an end, at which point a data source (e.g., a sensor) has to be placed, that will send packets to the data sink. A decision has to be made at each step whether or not to place a wireless relay node. Assuming that the packet generation rate by the source is very low, and simple link-by-link scheduling, we consider the problem of sequential relay placement so as to minimize the expectation of an end-to-end cost metric (a linear combination of the sum of convex hop costs and the number of relays placed). This impromptu relay placement problem is formulated as a total cost Markov decision process. First, we derive the optimal policy in terms of an optimal placement set and show that this set is characterized by a boundary (with respect to the position of the last placed relay) beyond which it is optimal to place the next relay. Next, based on a simpler one-step-look-ahead characterization of the optimal policy, we propose an algorithm which is proved to converge to the optimal placement set in a finite number of steps and which is faster than value iteration. We show by simulations that the distance threshold based heuristic, usually assumed in the literature, is close to the optimal, provided that the threshold distance is carefully chosen. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
Wireless adhoc networks transmit information from a source to a destination via multiple hops in order to save energy and, thus, increase the lifetime of battery-operated nodes. The energy savings can be especially significant in cooperative transmission schemes, where several nodes cooperate during one hop to forward the information to the next node along a route to the destination. Finding the best multi-hop transmission policy in such a network which determines nodes that are involved in each hop, is a very important problem, but also a very difficult one especially when the physical wireless channel behavior is to be accounted for and exploited. We model the above optimization problem for randomly fading channels as a decentralized control problem - the channel observations available at each node define the information structure, while the control policy is defined by the power and phase of the signal transmitted by each node. In particular, we consider the problem of computing an energy-optimal cooperative transmission scheme in a wireless network for two different channel fading models: (i) slow fading channels, where the channel gains of the links remain the same for a large number of transmissions, and (ii) fast fading channels, where the channel gains of the links change quickly from one transmission to another. For slow fading, we consider a factored class of policies (corresponding to local cooperation between nodes), and show that the computation of an optimal policy in this class is equivalent to a shortest path computation on an induced graph, whose edge costs can be computed in a decentralized manner using only locally available channel state information (CSI). For fast fading, both CSI acquisition and data transmission consume energy. Hence, we need to jointly optimize over both these; we cast this optimization problem as a large stochastic optimization problem. We then jointly optimize over a set of CSI functions of the local channel states, and a c- - orresponding factored class of control poli.
Resumo:
A construction for a family of sequences over the 8-ary AM-PSK constellation that has maximum nontrivial correlation magnitude bounded as theta(max) less than or similar to root N is presented here. The famfly is asymptotically optimal with respect to the Welch bound on maximum magnitude of correlation. The 8-ary AM-PSK constellation is a subset of the 16-QAM constellation. We also construct two families of sequences over 16-QAM with theta(max) less than or similar to root 2 root N. These families are constructed by interleaving sets of sequences. A construction for a famBy of low-correlation sequences over QAM alphabet of size 2(2m) is presented with maximum nontrivial normalized correlation parameter bounded above by less than or similar to a root N, where N is the period of the sequences in the family and where a ranges from 1.61 in the case of 16-QAM modulation to 2.76 for large m. When used in a CDMA setting, the family will permit each user to modulate the code sequence with 2m bits of data. Interestingly, the construction permits users on the reverse link of the CDMA channel to communicate using varying data rates by switching between sequence famflies; associated to different values of the parameter m. Other features of the sequence families are improved Euclidean distance between different data symbols in comparison with PSK signaling and compatibility of the QAM sequence families with sequences belonging to the large quaternary sequence families {S(p)}.
Resumo:
In this paper, we study the behaviour of the slotted Aloha multiple access scheme with a finite number of users under different traffic loads and optimize the retransmission probability q(r) for various settings, cost objectives and policies. First, we formulate the problem as a parameter optimization problem and use certain efficient smoothed functional algorithms for finding the optimal retransmission probability parameter. Next, we propose two classes of multi-level closed-loop feedback policies (for finding in each case the retransmission probability qr that now depends on the current system state) and apply the above algorithms for finding an optimal policy within each class of policies. While one of the policy classes depends on the number of backlogged nodes in the system, the other depends on the number of time slots since the last successful transmission. The latter policies are more realistic as it is difficult to keep track of the number of backlogged nodes at each instant. We investigate the effect of increasing the number of levels in the feedback policies. Wen also investigate the effects of using different cost functions (withn and without penalization) in our algorithms and the corresponding change in the throughput and delay using these. Both of our algorithms use two-timescale stochastic approximation. One of the algorithms uses one simulation while the other uses two simulations of the system. The two-simulation algorithm is seen to perform better than the other algorithm. Optimal multi-level closed-loop policies are seen to perform better than optimal open-loop policies. The performance further improves when more levels are used in the feedback policies.
Resumo:
Design of speaker identification schemes for a small number of speakers (around 10) with a high degree of accuracy in controlled environment is a practical proposition today. When the number of speakers is large (say 50–100), many of these schemes cannot be directly extended, as both recognition error and computation time increase monotonically with population size. The feature selection problem is also complex for such schemes. Though there were earlier attempts to rank order features based on statistical distance measures, it has been observed only recently that the best two independent measurements are not the same as the combination in two's for pattern classification. We propose here a systematic approach to the problem using the decision tree or hierarchical classifier with the following objectives: (1) Design of optimal policy at each node of the tree given the tree structure i.e., the tree skeleton and the features to be used at each node. (2) Determination of the optimal feature measurement and decision policy given only the tree skeleton. Applicability of optimization procedures such as dynamic programming in the design of such trees is studied. The experimental results deal with the design of a 50 speaker identification scheme based on this approach.
Resumo:
We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Wireless networks transmit information from a source to a destination via multiple hops in order to save energy and, thus, increase the lifetime of battery-operated nodes. The energy savings can be especially significant in cooperative transmission schemes, where several nodes cooperate during one hop to forward the information to the next node along a route to the destination. Finding the best multi-hop transmission policy in such a network which determines nodes that are involved in each hop, is a very important problem, but also a very difficult one especially when the physical wireless channel behavior is to be accounted for and exploited. We model the above optimization problem for randomly fading channels as a decentralized control problem – the channel observations available at each node define the information structure, while the control policy is defined by the power and phase of the signal transmitted by each node.In particular, we consider the problem of computing an energy-optimal cooperative transmission scheme in a wireless network for two different channel fading models: (i) slow fading channels, where the channel gains of the links remain the same for a large number of transmissions, and (ii) fast fading channels,where the channel gains of the links change quickly from one transmission to another. For slow fading, we consider a factored class of policies (corresponding to local cooperation between nodes), and show that the computation of an optimal policy in this class is equivalent to a shortest path computation on an induced graph, whose edge costs can be computed in a decentralized manner using only locally available channel state information(CSI). For fast fading, both CSI acquisition and data transmission consume energy. Hence, we need to jointly optimize over both these; we cast this optimization problem as a large stochastic optimization problem. We then jointly optimize over a set of CSI functions of the local channel states, and a corresponding factored class of control policies corresponding to local cooperation between nodes with a local outage constraint. The resulting optimal scheme in this class can again be computed efficiently in a decentralized manner. We demonstrate significant energy savings for both slow and fast fading channels through numerical simulations of randomly distributed networks.
Resumo:
We consider the problem of scheduling of a wireless channel (server) to several queues. Each queue has its own link (transmission) rate. The link rate of a queue can vary randomly from slot to slot. The queue lengths and channel states of all users are known at the beginning of each slot. We show the existence of an optimal policy that minimizes the long term (discounted) average sum of queue lengths. The optimal policy, in general needs to be computed numerically. Then we identify a greedy (one step optimal) policy, MAX-TRANS which is easy to implement and does not require the channel and traffic statistics. The cost of this policy is close to optimal and better than other well-known policies (when stable) although it is not throughput optimal for asymmetric systems. We (approximately) identify its stability region and obtain approximations for its mean queue lengths and mean delays. We also modify this policy to make it throughput optimal while retaining good performance.
Resumo:
We consider a problem of providing mean delay and average throughput guarantees in random access fading wireless channels using CSMA/CA algorithm. This problem becomes much more challenging when the scheduling is distributed as is the case in a typical local area wireless network. We model the CSMA network using a novel queueing network based approach. The optimal throughput per device and throughput optimal policy in an M device network is obtained. We provide a simple contention control algorithm that adapts the attempt probability based on the network load and obtain bounds for the packet transmission delay. The information we make use of is the number of devices in the network and the queue length (delayed) at each device. The proposed algorithms stay within the requirements of the IEEE 802.11 standard.