8 resultados para Simulated robots
em Universitat de Girona, Spain
Resumo:
In this paper we present a novel approach to assigning roles to robots in a team of physical heterogeneous robots. Its members compete for these roles and get rewards for them. The rewards are used to determine each agent’s preferences and which agents are better adapted to the environment. These aspects are included in the decision making process. Agent interactions are modelled using the concept of an ecosystem in which each robot is a species, resulting in emergent behaviour of the whole set of agents. One of the most important features of this approach is its high adaptability. Unlike some other learning techniques, this approach does not need to start a whole exploitation process when the environment changes. All this is exemplified by means of experiments run on a simulator. In addition, the algorithm developed was applied as applied to several teams of robots in order to analyse the impact of heterogeneity in these systems
Resumo:
Reinforcement learning (RL) is a very suitable technique for robot learning, as it can learn in unknown environments and in real-time computation. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes the convergence. The term semi-online is referred to the fact that the algorithm uses the current but also past learning samples. However, the algorithm is able to learn in real-time while the robot is interacting with the environment. The paper shows simulated results with the "mountain-car" benchmark and, also, real results with an underwater robot in a target following behavior
Resumo:
This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in cable tracking task. The learning system is characterized by using a direct policy search method for learning the internal state/action mapping. Policy only algorithms may suffer from long convergence times when dealing with real robotics. In order to speed up the process, the learning phase has been carried out in a simulated environment and, in a second step, the policy has been transferred and tested successfully on a real robot. Future steps plan to continue the learning process on-line while on the real robot while performing the mentioned task. We demonstrate its feasibility with real experiments on the underwater robot ICTINEU AUV
Resumo:
Autonomous underwater vehicles (AUV) represent a challenging control problem with complex, noisy, dynamics. Nowadays, not only the continuous scientific advances in underwater robotics but the increasing number of subsea missions and its complexity ask for an automatization of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of reinforcement learning direct policy search methods (RLDPS) for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target following task
Resumo:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, when using RL, has been to apply value function based algorithms, the system here detailed is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task
Resumo:
This work extends a previously developed research concerning about the use of local model predictive control in differential driven mobile robots. Hence, experimental results are presented as a way to improve the methodology by considering aspects as trajectory accuracy and time performance. In this sense, the cost function and the prediction horizon are important aspects to be considered. The aim of the present work is to test the control method by measuring trajectory tracking accuracy and time performance. Moreover, strategies for the integration with perception system and path planning are briefly introduced. In this sense, monocular image data can be used to plan safety trajectories by using goal attraction potential fields
Resumo:
This paper presents the distributed environment for virtual and/or real experiments for underwater robots (DEVRE). This environment is composed of a set of processes running on a local area network composed of three sites: 1) the onboard AUV computer; 2) a surface computer used as human-machine interface (HMI); and 3) a computer used for simulating the vehicle dynamics and representing the virtual world. The HMI can be transparently linked to the real sensors and actuators dealing with a real mission. It can also be linked with virtual sensors and virtual actuators, dealing with a virtual mission. The aim of DEVRE is to assist engineers during the software development and testing in the lab prior to real experiments
Resumo:
Aquesta tesi tracta el problema del posicionament de robots mòbils quan, en el decurs del moviment, es realitzen mesures angulars relatives al robot de l'orientació de la recta entre un dels seus punts i punts de l'entorn de posició coneguda. Es considera que les mesures angulars són fetes per un sensor làser giratori que detecta diferents reflectors catadiòptrics fixos. La contribució principal és el desenvolupament d'un algorisme dinàmic, basat en un filtre de Kalman estès (EKF), que estima a cada instant de temps l'estat format pels angles associats als reflectors. La simulació hodomètrica dels angles entre mesures directes del sensor làser garanteix l'ús consistent i continuat dels mètodes de triangulació per a determinar la posició i l'orientació del robot. Inclou simulacions informàtiques i experiments per a validar la precisió del mètode de posicionament proposat. En l'experimentació s'utilitza un robot mòbil omnidireccional amb tres rodes de lliscament direccional de corrons esfèrics.