145 results for Asymptotically optimal policy
Abstract:
In an underlay cognitive radio (CR) system, a secondary user can transmit when the primary is transmitting but is subject to tight constraints on the interference it causes to the primary receiver. Amplify-and-forward (AF) relaying is an effective technique that significantly improves the performance of a CR by providing an alternate path for the secondary transmitter's signal to reach the secondary receiver. We present and analyze a novel optimal relay gain adaptation policy (ORGAP) in which the relay is interference aware and optimally adapts both its gain and transmit power as a function of its local channel gains. ORGAP minimizes the symbol error probability at the secondary receiver subject to constraints on the average relay transmit power and on the average interference caused to the primary. It is different from ad hoc AF relaying policies and serves as a new and fundamental theoretical benchmark for relaying in an underlay CR. We also develop a near-optimal and simpler relay gain adaptation policy that is easy to implement. An extension to a multirelay scenario with selection is also developed. Our extensive numerical results for single and multiple relay systems quantify the power savings achieved over several ad hoc policies for both MPSK and MQAM constellations.
Abstract:
We apply the objective method of Aldous to the problem of finding the minimum-cost edge cover of the complete graph with random independent and identically distributed edge costs. The limit, as the number of vertices goes to infinity, of the expected minimum cost for this problem is known via a combinatorial approach of Hessler and Wästlund. We provide a proof of this result using the machinery of the objective method and local weak convergence, which was used to prove the $\zeta(2)$ limit of the random assignment problem. A proof via the objective method is useful because it provides more information on the nature of the edges incident on a typical root in the minimum-cost edge cover. We further show that a belief propagation algorithm converges asymptotically to the optimal solution. This can be applied to a computational linguistics problem of semantic projection. The belief propagation algorithm yields a near-optimal solution with lower complexity than the best known algorithms designed for optimality in worst-case settings.
Abstract:
We examine the effect of subdividing the potential barrier along the reaction coordinate on Kramers' escape rate for a model potential. Using the known supersymmetric potential approach, we show the existence of an optimal number of subdivisions that maximizes the rate.
Abstract:
The problem of optimal scheduling of the generation of a hydro-thermal power system that is faced with a shortage of energy is studied. The deterministic version of the problem is first analyzed, and the results are then extended to cases where the loads and the hydro inflows are random variables.
Abstract:
This paper deals with the optimal load flow problem in a fixed-head hydrothermal electric power system. Equality constraints on the volume of water available for active power generation at the hydro plants, as well as inequality constraints on the reactive power generation at the voltage-controlled buses, are imposed. Conditions for optimal load flow are derived and a successive approximation algorithm for solving the optimal generation schedule is developed. Computer implementation of the algorithm is discussed, and the results obtained from the computer solution of test systems are presented.
Abstract:
Systems of learning automata have been studied by various researchers to evolve useful strategies for decision making under uncertainty. This paper considers a class of hierarchical systems of learning automata where the system gets responses from its environment at each level of the hierarchy. A classification of such sequential learning tasks based on the complexity of the learning problem is presented. It is shown that none of the existing algorithms can perform in the most general type of hierarchical problem. An algorithm for learning the globally optimal path in this general setting is presented, and its convergence is established. This algorithm needs information transfer from the lower levels to the higher levels. Using the methodology of estimator algorithms, this model can be generalized to accommodate other kinds of hierarchical learning tasks.
Abstract:
We consider the problem of estimating the optimal parameter trajectory over a finite time interval in a parameterized stochastic differential equation (SDE), and propose a simulation-based algorithm for this purpose. Towards this end, we consider a discretization of the SDE over finite time instants and reformulate the problem as one of finding an optimal parameter at each of these instants. A stochastic approximation algorithm based on the smoothed functional technique is adapted to this setting for finding the optimal parameter trajectory. A proof of convergence of the algorithm is presented and results of numerical experiments over two different settings are shown. The algorithm is seen to exhibit good performance. We also present extensions of our framework to the case of finding optimal parameterized feedback policies for controlled SDE and present numerical results in this scenario as well.
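As a rough illustration of the smoothed functional idea mentioned in this abstract, the following is a minimal sketch of a two-sided Gaussian smoothed functional gradient estimate driving a stochastic approximation recursion. It is not the paper's algorithm: it optimizes a fixed parameter vector for a generic cost rather than a parameter trajectory for a discretized SDE, and the names `sf_gradient`, `sf_minimize`, the smoothing width `beta`, and the step-size schedule are all assumptions made for the example.

```python
import random


def sf_gradient(cost, theta, beta, rng):
    """Two-sided Gaussian smoothed functional gradient estimate.

    Perturbs theta along a random Gaussian direction eta and uses the
    difference of two cost evaluations, scaled by eta / (2 * beta).
    """
    eta = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = cost([t + beta * e for t, e in zip(theta, eta)])
    minus = cost([t - beta * e for t, e in zip(theta, eta)])
    scale = (plus - minus) / (2.0 * beta)
    return [scale * e for e in eta]


def sf_minimize(cost, theta0, iterations=5000, beta=0.1, seed=0):
    """Stochastic approximation with diminishing steps (assumed schedule)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for n in range(1, iterations + 1):
        step = 1.0 / (n + 10)  # hypothetical step-size choice
        grad = sf_gradient(cost, theta, beta, rng)
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    # Hypothetical test cost: a quadratic bowl centred at (1, -2).
    target = [1.0, -2.0]
    quadratic = lambda th: sum((t - c) ** 2 for t, c in zip(th, target))
    print(sf_minimize(quadratic, [0.0, 0.0]))
```

Only cost evaluations are needed, which is what makes this family of estimators usable when the cost comes from simulation.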
Abstract:
In this paper, we consider the bi-criteria single machine scheduling problem of n jobs with a learning effect. The two objectives considered are the total completion time (TC) and the total absolute differences in completion times (TADC). The goal is to find a sequence that performs well with respect to both objectives. An earlier study presented a method for solving the bi-criteria transportation problem. In this paper, we apply that methodology to our bi-criteria single machine scheduling problem with a learning effect and obtain the set of optimal sequences. Numerical examples are presented to illustrate the applicability of the approach and to make it easier to understand.
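The two objectives named in this abstract can be made concrete with a small sketch. The abstract does not state which learning model is used, so the code below assumes a position-based model in which the job in position r takes its base time multiplied by r raised to a negative learning index; the default index of -0.322 (roughly an 80% learning rate) and all function names are hypothetical. The non-dominated sequences are found by brute force, which is not the transportation-problem method the paper describes and is only practical for a handful of jobs.

```python
from itertools import permutations


def completion_times(seq, base_times, learning_index=-0.322):
    """Completion times under an assumed position-based learning model:
    the job in position r takes base_time * r ** learning_index."""
    elapsed = 0.0
    out = []
    for position, job in enumerate(seq, start=1):
        elapsed += base_times[job] * position ** learning_index
        out.append(elapsed)
    return out


def total_completion_time(completions):
    """TC: sum of all completion times."""
    return sum(completions)


def total_abs_diff_completion(completions):
    """TADC: sum of |C_i - C_j| over all unordered pairs of jobs."""
    n = len(completions)
    return sum(abs(completions[i] - completions[j])
               for i in range(n) for j in range(i + 1, n))


def pareto_sequences(base_times, learning_index=-0.322):
    """Brute-force the sequences not dominated in (TC, TADC)."""
    scored = []
    for seq in permutations(range(len(base_times))):
        c = completion_times(seq, base_times, learning_index)
        scored.append((seq, total_completion_time(c),
                       total_abs_diff_completion(c)))
    return [s for s in scored
            if not any(o[1] <= s[1] and o[2] <= s[2]
                       and (o[1] < s[1] or o[2] < s[2]) for o in scored)]


if __name__ == "__main__":
    for seq, tc, tadc in pareto_sequences([1.0, 2.0, 3.0]):
        print(seq, round(tc, 3), round(tadc, 3))
```

Setting `learning_index=0.0` switches the learning effect off, which is a convenient sanity check against hand-computed values.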
Abstract:
In a letter, Rau proposed a new method for designing state-feedback controllers using eigenvalue sensitivity matrices. However, there appears to be a conceptual mistake in the procedure, or else it is unduly restricted in its applicability. In particular the equation $-BR^{-1}B^{T}K = \Delta A$, in which K is a positive-definite symmetric matrix.
Abstract:
Cooperative relay communication in a fading channel environment under the orthogonal amplify-and-forward (OAF), nonorthogonal and orthogonal selection decode-and-forward (NSDF and OSDF) protocols is considered here. The diversity-multiplexing gain tradeoff (DMT) of the three protocols is determined and DMT-optimal distributed space-time (ST) code constructions are provided. The codes constructed are sphere decodable and in some instances incur minimum possible delay. Included in our results is the perhaps surprising finding that the orthogonal and the nonorthogonal amplify-and-forward (NAF) protocols have identical DMT when the time durations of the broadcast and cooperative phases are optimally chosen to suit the respective protocol. Moreover, our code construction for the OAF protocol incurs less delay. Two variants of the NSDF protocol are considered: the fixed-NSDF and the variable-NSDF protocol. In the variable-NSDF protocol, the fraction of time occupied by the broadcast phase is allowed to vary with multiplexing gain. The variable-NSDF protocol is shown to improve on the DMT of the best previously known static protocol when the number of relays is greater than two. Also included is a DMT-optimal code construction for the NAF protocol.
Abstract:
A learning automaton operating in a random environment updates its action probabilities on the basis of the reactions of the environment, so that asymptotically it chooses the optimal action. When the number of actions is large, the automaton becomes slow because there are too many updates to be made at each instant. A hierarchical system of such automata with assured ε-optimality is suggested to overcome that problem. The learning algorithm for the hierarchical system turns out to be a simple modification of the absolutely expedient algorithm known in the literature. The parameters of the algorithm at each level in the hierarchy depend only on the parameters and the action probabilities of the previous level. It follows that, to minimize the number of updates per cycle, each automaton in the hierarchy need have only two or three actions.
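To make the probability-update step in this abstract concrete, here is a minimal sketch of a single automaton using the linear reward-inaction rule, one standard absolutely expedient scheme. It is not the hierarchical algorithm the abstract proposes, and the step size, the reward probabilities of the simulated environment, and the function names are assumptions chosen for the example. Note that every action probability is touched on each rewarded step, which is the per-instant cost the abstract says grows with the number of actions.

```python
import random


def lri_update(probs, action, reward, step=0.05):
    """Linear reward-inaction update.

    On reward, move probability mass toward the chosen action;
    on penalty, leave the probabilities unchanged.
    """
    if not reward:
        return list(probs)
    return [p + step * (1.0 - p) if i == action else (1.0 - step) * p
            for i, p in enumerate(probs)]


def run_automaton(reward_probs, steps=20000, step=0.01, seed=0):
    """Simulate one automaton in a stationary random environment.

    reward_probs[i] is the (hypothetical) probability that the
    environment rewards action i.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = rng.random() < reward_probs[action]
        probs = lri_update(probs, action, reward, step)
    return probs


if __name__ == "__main__":
    print(run_automaton([0.2, 0.8]))
```

With a small step size the action probabilities typically concentrate on the action with the highest reward probability, although reward-inaction schemes can lock onto a suboptimal action with small probability.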
Abstract:
The problem of learning correct decision rules to minimize the probability of misclassification is a long-standing problem of supervised learning in pattern recognition. The problem of learning such optimal discriminant functions is considered for the class of problems where the statistical properties of the pattern classes are completely unknown. The problem is posed as a game with common payoff played by a team of mutually cooperating learning automata. This essentially results in a probabilistic search through the space of classifiers. The approach is inherently capable of learning discriminant functions that are also nonlinear in their parameters. A learning algorithm is presented for the team and convergence is established. It is proved that the team can obtain the optimal classifier to an arbitrary approximation. Simulation results with a few examples are presented where the team learns the optimal classifier.
Abstract:
A closed-loop steering logic based on an optimal Q-guidance is developed here. The guidance system drives the satellite launch vehicle along a two- or three-dimensional trajectory for placing the payload into a specified circular orbit. The modified Q-guidance algorithm makes use of the optimal required velocity vector, which minimizes the total impulse needed for an equivalent two-impulse transfer from the present state to the final orbit. The required velocity vector is defined as the velocity of the vehicle on the hypothetical transfer orbit immediately after the application of the first impulse. For this optimal transfer orbit, a simple and elegant expression for the Q-matrix is derived. A working principle for the guidance algorithm in terms of the major and minor cycles, and also for the generation of the steering command, is outlined.
Abstract:
This paper proposes a novel application of differential evolution to solve a difficult dynamic optimisation or optimal control problem. The miss distance in a missile-target engagement is minimised using differential evolution. The difficulty of solving it by existing conventional techniques in optimal control theory is caused by the nonlinearity of the dynamic constraint equation, an inequality constraint on the control input, and an inequality constraint on another parameter that enters the problem indirectly. The optimal control problem of finding the minimum miss distance has an analytical solution subject to several simplifying assumptions. In the approach proposed in this paper, the initial population is generated around the seed value given by this analytical solution. Thereafter, the algorithm progresses to an acceptable final solution within a few generations, satisfying the constraints at every iteration. Since this solution, or the control input, has to be obtained in real time to be of any use in practice, the feasibility of online implementation is also illustrated.
Abstract:
We study sensor networks with energy harvesting nodes. The generated energy at a node can be stored in a buffer. A sensor node periodically senses a random field and generates a packet. These packets are stored in a queue and transmitted using the energy available at that time at the node. For such networks we develop efficient energy management policies. First, for a single node, we obtain policies that are throughput optimal, i.e., the data queue stays stable for the largest possible data rate. Next, we obtain energy management policies which minimize the mean delay in the queue. We also compare the performance of several easily implementable suboptimal policies. A greedy policy is identified which, in the low-SNR regime, is throughput optimal and also minimizes mean delay. Next, using the results for a single node, we develop efficient MAC policies.