5 results for wireless mobile robots
in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco
Abstract:
215 p.
Abstract:
Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out concurrently by the agents. In this paper we formalize and prove the convergence of a Distributed Round Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out round-robin scheduling of action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of the locally optimal policies to obtain the globally optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.
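To illustrate the round-robin scheduling and state-action veto ideas summarized in this abstract, the sketch below combines per-agent Q-learning with a simple veto set. The environment interface (env.reset, env.step returning an "undesired termination" flag) and all names are illustrative assumptions, not the paper's reference implementation.

```python
import random
from collections import defaultdict

class RoundRobinQLearner:
    """Sketch of round-robin Q-learning with per-agent state-action vetoes."""

    def __init__(self, n_agents, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.n_agents = n_agents
        self.actions = actions                      # actions available to each agent
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # one local Q-table per agent: q[i][(state, action)] -> value
        self.q = [defaultdict(float) for _ in range(n_agents)]
        # vetoed (state, action) pairs per agent: pairs observed to lead
        # to undesired termination states are blocked from future selection
        self.vetoed = [set() for _ in range(n_agents)]

    def select_action(self, agent, state):
        allowed = [a for a in self.actions if (state, a) not in self.vetoed[agent]]
        if not allowed:                             # all actions vetoed: fall back
            allowed = self.actions
        if random.random() < self.epsilon:
            return random.choice(allowed)
        return max(allowed, key=lambda a: self.q[agent][(state, a)])

    def update(self, agent, state, action, reward, next_state, undesired_end):
        if undesired_end:
            # veto the pair that led to an undesired termination state
            self.vetoed[agent].add((state, action))
        best_next = max(self.q[agent][(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[agent][(state, action)] += self.alpha * (
            td_target - self.q[agent][(state, action)]
        )

def train_episode(env, learner):
    """One episode: agents act one at a time in a fixed round-robin order."""
    state = env.reset()
    done = False
    agent = 0
    while not done:
        # only one agent selects and executes per turn, so the environment
        # each agent faces stays stationary during its own update
        action = learner.select_action(agent, state)
        next_state, reward, done, undesired_end = env.step(agent, action)
        learner.update(agent, state, action, reward, next_state, undesired_end)
        state = next_state
        agent = (agent + 1) % learner.n_agents
```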
Abstract:
Enhancing the handover process in broadband wireless communication deployments has traditionally motivated many research initiatives. In a high-speed railway domain, the challenge is even greater. Owing to the long distances covered, the mobile node gets involved in a compulsory sequence of handover processes; consequently, poor performance during these handovers significantly degrades the global end-to-end performance. This article proposes a new handover strategy for the railway domain: the RMPA handover, a Reliable Mobility Pattern Aware IEEE 802.16 handover strategy "customized" for a high-speed mobility scenario. The stringent high-mobility requirement is balanced by three other positive features of a high-speed context: mobility pattern awareness, different sources for location discovery techniques, and a previously known traffic data profile. To the best of the authors' knowledge, there is no IEEE 802.16 handover scheme that simultaneously covers the optimization of the handover process itself and the efficient timing of the handover process. Our strategy covers both areas of research while providing a cost-effective and standards-based solution. To schedule the handover process efficiently, the RMPA strategy uses a context-aware handover policy, that is, a policy based on the mobile node's mobility pattern, the time required to perform the handover, the neighboring network conditions, the data traffic profile, the received signal power, and the current location and speed of the train. Our proposal merges all these variables through cross-layer interaction in the handover policy engine. It also enhances the handover process itself by establishing the values for the set of handover configuration parameters and mechanisms. RMPA is a cost-effective strategy because compatibility with standards-based equipment is guaranteed. The major contributions of the RMPA handover are in areas that have been left open to the handover designer's discretion. Our simulation analysis validates the RMPA handover decision rules and design choices. Our results for a high-demand video application in the uplink stream show a significant improvement in end-to-end quality of service parameters, including end-to-end delay (22%) and jitter (80%), when compared with a policy based on signal-to-noise ratio information.
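To show how a context-aware handover policy of this kind could combine mobility, timing, signal, and traffic inputs, the sketch below scores a candidate handover with a simple decision rule. All field names, thresholds, and the scoring logic are illustrative assumptions, not the RMPA specification.

```python
from dataclasses import dataclass

@dataclass
class HandoverContext:
    distance_to_cell_edge_m: float    # from the known mobility pattern / location
    speed_mps: float                  # current train speed
    handover_time_s: float            # expected time to complete the handover
    serving_rssi_dbm: float           # received power from the serving base station
    target_rssi_dbm: float            # received power from the candidate base station
    target_load: float                # neighboring network conditions, 0..1
    required_uplink_kbps: float       # known traffic data profile (e.g. video uplink)
    target_capacity_kbps: float       # capacity the candidate cell can offer

def should_trigger_handover(ctx: HandoverContext,
                            rssi_margin_db: float = 3.0,
                            guard_factor: float = 1.5) -> bool:
    """Decide whether to start the handover now, given the current context."""
    if ctx.speed_mps <= 0:
        return False

    # Time left before the serving cell is expected to be lost, derived from
    # the mobility pattern and location rather than from signal alone.
    time_to_cell_edge_s = ctx.distance_to_cell_edge_m / ctx.speed_mps

    # Trigger early enough that the handover completes with a safety margin.
    timing_ok = time_to_cell_edge_s <= guard_factor * ctx.handover_time_s

    # The candidate must be usefully stronger and able to carry the known load.
    link_ok = ctx.target_rssi_dbm >= ctx.serving_rssi_dbm + rssi_margin_db
    capacity_ok = (1.0 - ctx.target_load) * ctx.target_capacity_kbps >= ctx.required_uplink_kbps

    return timing_ok and link_ok and capacity_ok
```

A rule of this shape contrasts with a purely signal-to-noise-ratio-based trigger in that the timing term comes from the known route and speed, which is the aspect the abstract highlights for the railway scenario.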
Abstract:
221 p.
Abstract:
110 p.