921 resultados para Differentiable dynamical systems
Resumo:
There has been revival of interest in Jerky flow from the point of view of dynamical systems. The earliest attempt in this direction was from our group. One of the predictions of the theory is that Jerky flow could be chaotic. This has been recently verified by us. We have recently extended the earlier model to account for the spatial aspect as well. Both these models are in the form of coupled set of nonlinear differential equations and hence, they are complicated in their structure. For this reason we wish to devise a model based on the results of these two theories in the form of coupled lattice map for the description of the formation and propagation of dislocation bands. We report here one such model and its results.
Resumo:
We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values of states with actions chosen according to the ‘current ’ randomized policy updates. A sketch of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms for routing on different network topologies are presented and performance comparisons with the regular Q-learning algorithm are shown.
Resumo:
In this paper, we address a key problem faced by advertisers in sponsored search auctions on the web: how much to bid, given the bids of the other advertisers, so as to maximize individual payoffs? Assuming the generalized second price auction as the auction mechanism, we formulate this problem in the framework of an infinite horizon alternative-move game of advertiser bidding behavior. For a sponsored search auction involving two advertisers, we characterize all the pure strategy and mixed strategy Nash equilibria. We also prove that the bid prices will lead to a Nash equilibrium, if the advertisers follow a myopic best response bidding strategy. Following this, we investigate the bidding behavior of the advertisers if they use Q-learning. We discover empirically an interesting trend that the Q-values converge even if both the advertisers learn simultaneously.
Resumo:
In a computational grid, the presence of grid resource providers who are rational and intelligent could lead to an overall degradation in the efficiency of the grid. In this paper, we design incentive compatible grid resource procurement mechanisms which ensure that the efficiency of the grid is not affected by the rational behavior of resource providers.In particular, we offer three elegant incentive compatible mechanisms for this purpose: (1) G-DSIC (Grid-Dominant Strategy Incentive Compatible) mechanism (2) G-BIC (Grid-Bayesian Nash Incentive Compatible) mechanism (3) G-OPT(Grid-Optimal) mechanism which minimizes the cost to the grid user, satisfying at the same time, (a) Bayesian incentive compatibility and (b) individual rationality. We evaluate the relative merits and demerits of the above three mechanisms using game theoretical analysis and numerical experiments.
Resumo:
This paper addresses the problem of multiagent search in an unknown environment. The agents are autonomous in nature and are equipped with necessary sensors to carry out the search operation. The uncertainty, or lack of information about the search area is known a priori as a probability density function. The agents are deployed in an optimal way so as to maximize the one step uncertainty reduction. The agents continue to deploy themselves and reduce uncertainty till the uncertainty density is reduced over the search space below a minimum acceptable level. It has been shown, using LaSalle’s invariance principle, that a distributed control law which moves each of the agents towards the centroid of its Voronoi partition, modified by the sensor range leads to single step optimal deployment. This principle is now used to devise search trajectories for the agents. The simulations were carried out in 2D space with saturation on speeds of the agents. The results show that the control strategy per step indeed moves the agents to the respective centroid and the algorithm reduces the uncertainty distribution to the required level within a few steps.
Resumo:
Based on dynamic inversion, a relatively straightforward approach is presented in this paper for nonlinear flight control design of high performance aircrafts, which does not require the normal and lateral acceleration commands to be first transferred to body rates before computing the required control inputs. This leads to substantial improvement of the tracking response. Promising results are obtained from six degree-offreedom simulation studies of F-16 aircraft, which are found to be superior as compared to an existing approach (which is also based on dynamic inversion). The new approach has two potential benefits, namely reduced oscillatory response (including elimination of non-minimum phase behavior) and reduced control magnitude. Next, a model-following neuron-adaptive design is augmented the nominal design in order to assure robust performance in the presence of parameter inaccuracies in the model. Note that in the approach the model update takes place adaptively online and hence it is philosophically similar to indirect adaptive control. However, unlike a typical indirect adaptive control approach, there is no need to update the individual parameters explicitly. Instead the inaccuracy in the system output dynamics is captured directly and then used in modifying the control. This leads to faster adaptation, which helps in stabilizing the unstable plant quicker. The robustness study from a large number of simulations shows that the adaptive design has good amount of robustness with respect to the expected parameter inaccuracies in the model.