967 results for "Asymptotically optimal policy"
Abstract:
A construction for a family of sequences over the 8-ary AM-PSK constellation that has maximum nontrivial correlation magnitude bounded as $\theta_{\max} \lesssim \sqrt{N}$ is presented here. The family is asymptotically optimal with respect to the Welch bound on the maximum magnitude of correlation. The 8-ary AM-PSK constellation is a subset of the 16-QAM constellation. We also construct two families of sequences over 16-QAM with $\theta_{\max} \lesssim \sqrt{2}\,\sqrt{N}$. These families are constructed by interleaving sets of sequences. A construction for a family of low-correlation sequences over a QAM alphabet of size $2^{2m}$ is presented with maximum nontrivial normalized correlation parameter bounded above by $\lesssim a\sqrt{N}$, where $N$ is the period of the sequences in the family and where $a$ ranges from 1.61 in the case of 16-QAM modulation to 2.76 for large $m$. When used in a CDMA setting, the family will permit each user to modulate the code sequence with $2m$ bits of data. Interestingly, the construction permits users on the reverse link of the CDMA channel to communicate using varying data rates by switching between sequence families associated with different values of the parameter $m$. Other features of the sequence families are improved Euclidean distance between different data symbols in comparison with PSK signaling and compatibility of the QAM sequence families with sequences belonging to the large quaternary sequence families $\{S(p)\}$.
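For context, the Welch bound invoked above (quoted here from the standard literature rather than from this abstract) lower-bounds the maximum nontrivial correlation of any family of $M$ sequences of period (and energy) $N$:

$$\theta_{\max} \;\ge\; N\sqrt{\frac{M-1}{MN-1}} \;\approx\; \sqrt{N} \quad \text{for large } M,\,N,$$

so a family achieving $\theta_{\max} \lesssim \sqrt{N}$ is asymptotically optimal in this sense.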
Abstract:
In this paper, we study the behaviour of the slotted Aloha multiple access scheme with a finite number of users under different traffic loads and optimize the retransmission probability $q_r$ for various settings, cost objectives and policies. First, we formulate the problem as a parameter optimization problem and use certain efficient smoothed functional algorithms for finding the optimal retransmission probability parameter. Next, we propose two classes of multi-level closed-loop feedback policies (for finding in each case the retransmission probability $q_r$ that now depends on the current system state) and apply the above algorithms for finding an optimal policy within each class of policies. While one of the policy classes depends on the number of backlogged nodes in the system, the other depends on the number of time slots since the last successful transmission. The latter policies are more realistic, as it is difficult to keep track of the number of backlogged nodes at each instant. We investigate the effect of increasing the number of levels in the feedback policies. We also investigate the effects of using different cost functions (with and without penalization) in our algorithms and the corresponding changes in throughput and delay. Both of our algorithms use two-timescale stochastic approximation. One of the algorithms uses one simulation while the other uses two simulations of the system. The two-simulation algorithm is seen to perform better than the other algorithm. Optimal multi-level closed-loop policies are seen to perform better than optimal open-loop policies. The performance further improves when more levels are used in the feedback policies.
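As a rough, self-contained illustration of the open-loop setting above (a plain Monte Carlo simulation, not the authors' smoothed functional algorithm), the sketch below estimates slotted Aloha throughput for a fixed retransmission probability; the node count, arrival probability, and other parameters are arbitrary placeholders.

```python
import random

def slotted_aloha_throughput(n_nodes=10, q_r=0.1, arrival_p=0.05,
                             slots=100_000, seed=0):
    """Estimate slotted-Aloha throughput (successful transmissions per slot).

    Each node buffers at most one packet; a new packet arrives with
    probability arrival_p per idle node per slot and is sent immediately,
    while a backlogged node retransmits with probability q_r.  A slot is
    successful iff exactly one node transmits.
    """
    rng = random.Random(seed)
    backlogged = [False] * n_nodes
    successes = 0
    for _ in range(slots):
        transmitters = []
        for i in range(n_nodes):
            if not backlogged[i]:
                if rng.random() < arrival_p:
                    backlogged[i] = True      # fresh arrival, sent in this slot
                    transmitters.append(i)
            elif rng.random() < q_r:
                transmitters.append(i)        # retransmission attempt
        if len(transmitters) == 1:
            backlogged[transmitters[0]] = False
            successes += 1
    return successes / slots

if __name__ == "__main__":
    # Sweeping q_r exposes the throughput/backlog trade-off the paper optimizes.
    for q in (0.02, 0.05, 0.1, 0.2, 0.4):
        print(f"q_r={q}: throughput ~ {slotted_aloha_throughput(q_r=q):.3f}")
```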
Abstract:
This thesis studies quantile residuals and uses different methodologies to develop test statistics that are applicable in evaluating linear and nonlinear time series models based on continuous distributions. Models based on mixtures of distributions are of special interest because it turns out that for those models traditional residuals, often referred to as Pearson's residuals, are not appropriate. As such models have become more and more popular in practice, especially with financial time series data, there is a need for reliable diagnostic tools that can be used to evaluate them. The aim of the thesis is to show how such diagnostic tools can be obtained and used in model evaluation. The quantile residuals considered here are defined in such a way that, when the model is correctly specified and its parameters are consistently estimated, they are approximately independent with standard normal distribution. All the tests derived in the thesis are pure significance type tests and are theoretically sound in that they properly take the uncertainty caused by parameter estimation into account. In Chapter 2, a general framework based on the likelihood function and smooth functions of univariate quantile residuals is derived that can be used to obtain misspecification tests for various purposes. Three easy-to-use tests aimed at detecting non-normality, autocorrelation, and conditional heteroscedasticity in quantile residuals are formulated. It also turns out that these tests can be interpreted as Lagrange multiplier or score tests, so that they are asymptotically optimal against local alternatives. Chapter 3 extends the concept of quantile residuals to multivariate models. The framework of Chapter 2 is generalized, and tests aimed at detecting non-normality, serial correlation, and conditional heteroscedasticity in multivariate quantile residuals are derived based on it. Score test interpretations are obtained for the serial correlation and conditional heteroscedasticity tests and, in a rather restricted special case, for the normality test. In Chapter 4 the tests are constructed using the empirical distribution function of quantile residuals. The so-called Khmaladze martingale transformation is applied in order to eliminate the uncertainty caused by parameter estimation. Various test statistics are considered so that critical bounds for histogram type plots as well as Quantile-Quantile and Probability-Probability type plots of quantile residuals are obtained. Chapters 2, 3, and 4 contain simulations and empirical examples which illustrate the finite-sample size and power properties of the derived tests and also how the tests and related graphical tools based on residuals are applied in practice.
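For reference, the standard definition of the quantile residual of an observation $y_t$ under a fitted model with conditional distribution function $F$ and estimated parameters $\hat{\theta}$ is obtained via the probability integral transform,

$$r_t \;=\; \Phi^{-1}\!\bigl(F(y_t \mid \mathcal{F}_{t-1};\, \hat{\theta})\bigr),$$

where $\Phi^{-1}$ is the standard normal quantile function; under correct specification the $r_t$ are approximately independent standard normal, which is the property all the tests in the thesis build on.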
Abstract:
Design of speaker identification schemes for a small number of speakers (around 10) with a high degree of accuracy in a controlled environment is a practical proposition today. When the number of speakers is large (say 50–100), many of these schemes cannot be directly extended, as both recognition error and computation time increase monotonically with population size. The feature selection problem is also complex for such schemes. Though there were earlier attempts to rank-order features based on statistical distance measures, it has been observed only recently that the two individually best measurements do not necessarily form the best pair for pattern classification. We propose here a systematic approach to the problem using the decision tree or hierarchical classifier with the following objectives: (1) design of an optimal policy at each node of the tree given the tree structure, i.e., the tree skeleton and the features to be used at each node; (2) determination of the optimal feature measurement and decision policy given only the tree skeleton. The applicability of optimization procedures such as dynamic programming to the design of such trees is studied. The experimental results deal with the design of a 50-speaker identification scheme based on this approach.
Abstract:
Tax havens have attracted increasing attention from the authorities of non-haven countries, and the financial crisis has exacerbated the negative attitude towards them. Offshore zones are now under strong pressure from international institutions, both financial and political. This thesis therefore focuses on a current problem of the modern economy, namely tax havens and their impact on non-haven countries. The thesis is based on several articles, in particular “Tax Competition With Parasitic Tax Havens” by Joel Slemrod and John D. Wilson (University of Michigan, 2009) and “Do Havens Divert Economic Activity” by James R. Hines Jr., C. Fritz Foley and Mihir A. Desai (Ross School of Business, 2005). These provide two completely different and contradictory viewpoints on the problem of coexisting tax havens and non-haven countries. Two models, corresponding to these two lines of research, are examined in this work: the first concentrates on the positive effects of tax havens, whereas the second focuses on the purely negative effects of offshore jurisdictions. The first model explains and supports the claim that tax havens can positively influence nearby high-tax countries: the existence of offshore jurisdictions can stimulate the growth of operations and facilitate economic activity in non-haven countries. The second model takes the opposite view, and its analysis confirms the undesirability of the existence of offshore areas. Given that jurisdictions choose their optimal policies, the elimination of offshore havens would have a positive impact on the remaining countries: the model proves that full or partial elimination of tax havens raises the equilibrium level of the public good and increases country welfare. It can be concluded that both models provide telling arguments for their assertions, so both points of view have a right to exist; nevertheless, the ongoing debate on this issue will continue to raise many questions.
Abstract:
The purpose of this thesis is to build a model with which the operations of a dairy farm can be assessed comprehensively from the perspective of nutrient runoff. The aim is above all an analytical treatment that nevertheless takes into account the characteristic features of milk production, key examples being the interdependencies between feed, milk output, and manure properties. The analysis examines the divergence between the farm owner's private utility maximization and society's objectives. In addition, optimal policy instruments are derived, and the effectiveness of certain simple instruments is evaluated. Based on the analysis of the model, the private and social optima differ with respect to all decision variables. However, the magnitude or direction of the differences cannot be stated in general without committing to specific functional forms and parameter values. Optimal policy instruments should be imposed on the application of mineral fertilizer and manure, on concentrate feeding, and on silage cultivation, but not on the number of animals. On the other hand, the optimal instruments are very complex, and the effectiveness of the simpler instruments considered cannot be assessed precisely at the analytical level. According to the numerical results, in the social optimum the number of animals would be slightly smaller than in the private optimum, the use of concentrate feed lower, more land would be allocated to silage cultivation, and fertilization levels would be somewhat lower across the board. Manure application differs with respect to distance: in both cases the intensity of manure application increases toward the farm center, but in the private optimum all surplus manure is dumped on the nearest field, whereas in the social optimum manure is spread more evenly across the fields. The difference in total welfare remains small, but an increase in the number of animals would aggravate the runoff damage caused by manure dumping. Of the simple instruments, a fertilization limit as well as taxes on nutrients and mineral fertilizer proved reasonably workable according to the numerical results. By contrast, restricting the number of animals and subsidies compensating for manure transport costs appear, on the basis of the results, to be poor solutions.
Abstract:
In this article, we develop the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that performs a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy.
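A minimal sketch of the SPSA gradient estimate mentioned above, paired with a Lagrange-multiplier update, is given below; it is a generic illustration under assumed black-box cost estimators `J_cost` and `J_constraint` with made-up step sizes, not the paper's full multi-timescale actor-critic.

```python
import numpy as np

def spsa_gradient(J, theta, delta=0.05, rng=None):
    """Two-measurement SPSA estimate of the gradient of J at theta.

    J     : callable returning a (possibly noisy) scalar cost estimate
    theta : current parameter vector (numpy array)
    delta : perturbation magnitude
    """
    rng = rng or np.random.default_rng()
    perturb = rng.choice([-1.0, 1.0], size=theta.shape)   # Rademacher perturbation
    j_plus = J(theta + delta * perturb)
    j_minus = J(theta - delta * perturb)
    return (j_plus - j_minus) / (2.0 * delta * perturb)

def constrained_step(J_cost, J_constraint, theta, lam, a=0.01, b=0.001):
    """One update of the policy parameters (fast timescale) and multiplier (slow)."""
    grad = spsa_gradient(lambda th: J_cost(th) + lam * J_constraint(th), theta)
    theta_new = theta - a * grad                           # descend the Lagrangian
    lam_new = max(0.0, lam + b * J_constraint(theta))      # ascend on constraint violation
    return theta_new, lam_new
```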
Abstract:
Wireless networks transmit information from a source to a destination via multiple hops in order to save energy and, thus, increase the lifetime of battery-operated nodes. The energy savings can be especially significant in cooperative transmission schemes, where several nodes cooperate during one hop to forward the information to the next node along a route to the destination. Finding the best multi-hop transmission policy in such a network, which determines the nodes involved in each hop, is a very important problem, but also a very difficult one, especially when the physical wireless channel behavior is to be accounted for and exploited. We model the above optimization problem for randomly fading channels as a decentralized control problem – the channel observations available at each node define the information structure, while the control policy is defined by the power and phase of the signal transmitted by each node. In particular, we consider the problem of computing an energy-optimal cooperative transmission scheme in a wireless network for two different channel fading models: (i) slow fading channels, where the channel gains of the links remain the same for a large number of transmissions, and (ii) fast fading channels, where the channel gains of the links change quickly from one transmission to another. For slow fading, we consider a factored class of policies (corresponding to local cooperation between nodes), and show that the computation of an optimal policy in this class is equivalent to a shortest path computation on an induced graph, whose edge costs can be computed in a decentralized manner using only locally available channel state information (CSI). For fast fading, both CSI acquisition and data transmission consume energy. Hence, we need to jointly optimize over both; we cast this optimization problem as a large stochastic optimization problem. We then jointly optimize over a set of CSI functions of the local channel states and a corresponding factored class of control policies (local cooperation between nodes) with a local outage constraint. The resulting optimal scheme in this class can again be computed efficiently in a decentralized manner. We demonstrate significant energy savings for both slow and fast fading channels through numerical simulations of randomly distributed networks.
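To make the slow-fading result concrete: once each node has computed, from its local CSI, the energy cost of the hops it participates in, route selection reduces to an ordinary shortest-path problem. The sketch below runs textbook Dijkstra on a tiny made-up graph; the node labels and edge energies are placeholders, not values from the paper.

```python
import heapq

def dijkstra(edges, source, target):
    """Shortest path on a weighted digraph given as {u: [(v, cost), ...]}."""
    dist, parent, visited = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, c in edges.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                parent[v] = u
                heapq.heappush(heap, (d + c, v))
    path = [target]
    while path[-1] != source:          # walk back along parent pointers
        path.append(parent[path[-1]])
    return list(reversed(path)), dist[target]

# Placeholder edge energies; in the paper's setting each cost would be the
# minimum energy of one (possibly cooperative) hop, computed from local CSI.
edges = {"S": [("A", 1.0), ("B", 2.5)], "A": [("B", 0.5), ("D", 4.0)],
         "B": [("D", 1.5)], "D": []}
print(dijkstra(edges, "S", "D"))       # -> (['S', 'A', 'B', 'D'], 3.0)
```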
Abstract:
We consider the problem of scheduling a wireless channel (server) among several queues. Each queue has its own link (transmission) rate, which can vary randomly from slot to slot. The queue lengths and channel states of all users are known at the beginning of each slot. We show the existence of an optimal policy that minimizes the long term (discounted) average sum of queue lengths. The optimal policy, in general, needs to be computed numerically. We then identify a greedy (one-step optimal) policy, MAX-TRANS, which is easy to implement and does not require the channel and traffic statistics. The cost of this policy is close to optimal and better than other well-known policies (when stable), although it is not throughput optimal for asymmetric systems. We (approximately) identify its stability region and obtain approximations for its mean queue lengths and mean delays. We also modify this policy to make it throughput optimal while retaining good performance.
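One plausible reading of the one-step-optimal rule (serve the queue from which the most data can be drained in the current slot, which maximally reduces the queue-length sum over one step) is sketched below; the exact rate model and tie-breaking used for MAX-TRANS in the paper may differ.

```python
def greedy_max_trans(queues, rates):
    """Greedy one-step rule: schedule the user that can drain the most data now.

    queues : current queue lengths (same units as rates, e.g. packets per slot)
    rates  : current link rates, known at the start of the slot
    Returns the index of the scheduled user.
    """
    drained = [min(q, r) for q, r in zip(queues, rates)]
    return max(range(len(queues)), key=drained.__getitem__)

# Toy slot: user 1 has the best channel, but serving user 2 clears more data.
print(greedy_max_trans(queues=[3, 1, 8], rates=[2, 9, 5]))  # -> 2
```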
Abstract:
We consider the problem of providing mean delay and average throughput guarantees in random access fading wireless channels using the CSMA/CA algorithm. This problem becomes much more challenging when the scheduling is distributed, as is the case in a typical local area wireless network. We model the CSMA network using a novel queueing-network-based approach. The optimal throughput per device and a throughput-optimal policy in an M-device network are obtained. We provide a simple contention control algorithm that adapts the attempt probability based on the network load and obtain bounds on the packet transmission delay. The information we make use of is the number of devices in the network and the (delayed) queue length at each device. The proposed algorithms stay within the requirements of the IEEE 802.11 standard.
Abstract:
The standard quantum search algorithm lacks a feature, enjoyed by many classical algorithms, of having a fixed point, i.e., monotonic convergence towards the solution. Here we present two variations of the quantum search algorithm which get around this limitation. The first replaces the selective inversions in the algorithm by selective phase shifts of $\frac{\pi}{3}$. The second controls the selective inversion operations using two ancilla qubits, and irreversible measurement operations on the ancilla qubits drive the starting state towards the target state. Using $q$ oracle queries, these variations reduce the probability of finding a non-target state from $\epsilon$ to $\epsilon^{2q+1}$, which is asymptotically optimal. Similar ideas can lead to robust quantum algorithms and provide conceptually new schemes for error correction.
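To put an arbitrary number on the quoted $\epsilon \to \epsilon^{2q+1}$ reduction: even a mediocre starting procedure is sharpened dramatically after a few queries, e.g.

$$\epsilon = 0.5,\ q = 5 \;\Longrightarrow\; \epsilon^{2q+1} = 0.5^{11} = \tfrac{1}{2048} \approx 4.9\times10^{-4},$$

and, by the fixed-point property stated above, the convergence towards the target is monotonic rather than oscillatory.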
Abstract:
Our work is motivated by geographical forwarding of sporadic alarm packets to a base station in a wireless sensor network (WSN), where the nodes sleep-wake cycle periodically and asynchronously. We seek to develop local forwarding algorithms that can be tuned so as to trade off the end-to-end delay against a total cost, such as the hop count or total energy. Our approach is to solve, at each forwarding node en route to the sink, the local forwarding problem of minimizing the one-hop waiting delay subject to a lower bound constraint on a suitable reward offered by the next-hop relay; the constraint serves to tune the tradeoff. The reward metric used for the local problem is based on the end-to-end total cost objective (for instance, when the total cost is hop count, we choose to use the progress toward the sink made by a relay as the reward). The forwarding node, to begin with, is uncertain about the number of relays, their wake-up times, and the reward values, but knows the probability distributions of these quantities. At each relay wake-up instant, when a relay reveals its reward value, the forwarding node's problem is to forward the packet or to wait for further relays to wake up. In terms of the operations research literature, our work can be considered as a variant of the asset selling problem. We formulate our local forwarding problem as a partially observable Markov decision process (POMDP) and obtain inner and outer bounds for the optimal policy. Motivated by the computational complexity involved in the policies derived from these bounds, we formulate an alternate simplified model, the optimal policy for which is a simple threshold rule. We provide simulation results to compare the performance of the inner and outer bound policies against the simple policy, and also against the optimal policy when the source knows the exact number of relays. Observing the good performance and the ease of implementation of the simple policy, we apply it to our motivating problem, i.e., local geographical routing of sporadic alarm packets in a large WSN. We compare the end-to-end performance (i.e., average total delay and average total cost) obtained by the simple policy, when used for local geographical forwarding, against that obtained by the globally optimal forwarding algorithm proposed by Kim et al. [1].
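A minimal sketch of a threshold rule of the kind the simplified model yields is given below; the actual threshold in the paper comes from solving that model, whereas the (possibly time-varying) threshold, wake-up times, and rewards here are made-up placeholders.

```python
def forward_on_threshold(wakeups, threshold):
    """Forward to the first awake relay whose revealed reward clears the threshold.

    wakeups   : (wake_time, reward) pairs in order of wake-up
    threshold : callable mapping elapsed waiting time to the required reward
    Returns the chosen (wake_time, reward), or None if no relay qualifies.
    """
    for wake_time, reward in wakeups:
        if reward >= threshold(wake_time):
            return wake_time, reward
    return None

# Placeholder rewards (e.g., progress toward the sink) and a threshold that
# relaxes as more waiting delay is accumulated.
relays = [(0.12, 0.30), (0.35, 0.55), (0.60, 0.90)]
print(forward_on_threshold(relays, threshold=lambda t: 0.8 - 0.5 * t))  # -> (0.6, 0.9)
```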
Abstract:
The assignment of tasks to multiple resources becomes an interesting game-theoretic problem when both the task owner and the resources are strategic. In the classical, nonstrategic setting, where the states of the tasks and resources are observable by the controller, this problem is that of finding an optimal policy for a Markov decision process (MDP). When the states are held by strategic agents, the problem of efficient task allocation extends beyond that of solving an MDP and becomes that of designing a mechanism. Motivated by this fact, we propose a general mechanism which decides on an allocation rule for the tasks and resources and a payment rule to incentivize agents' participation and truthful reports. In contrast to related dynamic strategic control problems studied in recent literature, the problem studied here has interdependent values: the benefit of an allocation to the task owner is not simply a function of the characteristics of the task itself and the allocation, but also of the state of the resources. We introduce a dynamic extension of Mezzetti's two-phase mechanism for interdependent valuations. In this changed setting, the proposed dynamic mechanism is efficient, within-period ex-post incentive compatible, and within-period ex-post individually rational.
Abstract:
Channel-aware assignment of sub-channels to users in the downlink of an OFDMA system demands extensive feedback of channel state information (CSI) to the base station. Since the feedback bandwidth is often very scarce, schemes that limit feedback are necessary. We develop a novel, low-feedback, splitting-based algorithm for assigning each sub-channel to its best user, i.e., the user with the highest gain for that sub-channel among all users. The key idea behind the algorithm is that, at any time, each user contends for the sub-channel on which it has the largest channel gain among the unallocated sub-channels. Unlike other existing schemes, the algorithm explicitly handles the multiple access control aspects associated with the feedback of CSI. A tractable asymptotic analysis of a system with a large number of users helps design the algorithm. It yields 50% to 65% throughput gains compared to an asymptotically optimal one-bit feedback scheme when the number of users is as small as 10 or as large as 1000. The algorithm is fast and distributed, and scales with the number of users.
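For intuition about "splitting", the sketch below shows a generic splitting-based contention resolution that identifies the user with the largest metric on a single sub-channel using only idle/success/collision feedback; it assumes channel gains have been mapped to metrics in (0, 1] and is not the paper's multi-sub-channel algorithm.

```python
import random

def splitting_select_best(metrics, max_minislots=20):
    """Find the user with the largest metric via idle/success/collision feedback.

    metrics : per-user values in (0, 1] (e.g., channel gains passed through
              their CDF so that they are comparable across users)
    Returns (winner_index, minislots_used), or (None, max_minislots) if the
    contention was not resolved within the minislot budget.
    """
    lo, hi = 0.0, 1.0                       # invariant: the maximum lies in (lo, hi]
    for slot in range(1, max_minislots + 1):
        t = (lo + hi) / 2.0                 # users contend if their metric exceeds t
        contenders = [i for i, x in enumerate(metrics) if t < x <= hi]
        if len(contenders) == 1:            # success feedback: best user found
            return contenders[0], slot
        if contenders:                      # collision: the best user is above t
            lo = t
        else:                               # idle: the best user is at or below t
            hi = t
    return None, max_minislots

random.seed(1)
gains = [random.random() for _ in range(10)]
winner, used = splitting_select_best(gains)
print(winner == gains.index(max(gains)), used)   # typically True after a few minislots
```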