2 resultados para ROBOTICS
em Repositório Institucional da Universidade de Aveiro - Portugal
Resumo:
When developing software for autonomous mobile robots, one has to inevitably tackle some kind of perception. Moreover, when dealing with agents that possess some level of reasoning for executing their actions, there is the need to model the environment and the robot internal state in a way that it represents the scenario in which the robot operates. Inserted in the ATRI group, part of the IEETA research unit at Aveiro University, this work uses two of the projects of the group as test bed, particularly in the scenario of robotic soccer with real robots. With the main objective of developing algorithms for sensor and information fusion that could be used e ectively on these teams, several state of the art approaches were studied, implemented and adapted to each of the robot types. Within the MSL RoboCup team CAMBADA, the main focus was the perception of ball and obstacles, with the creation of models capable of providing extended information so that the reasoning of the robot can be ever more e ective. To achieve it, several methodologies were analyzed, implemented, compared and improved. Concerning the ball, an analysis of ltering methodologies for stabilization of its position and estimation of its velocity was performed. Also, with the goal keeper in mind, work has been done to provide it with information of aerial balls. As for obstacles, a new de nition of the way they are perceived by the vision and the type of information provided was created, as well as a methodology for identifying which of the obstacles are team mates. Also, a tracking algorithm was developed, which ultimately assigned each of the obstacles a unique identi er. Associated with the improvement of the obstacles perception, a new algorithm of estimating reactive obstacle avoidance was created. In the context of the SPL RoboCup team Portuguese Team, besides the inevitable adaptation of many of the algorithms already developed for sensor and information fusion and considering that it was recently created, the objective was to create a sustainable software architecture that could be the base for future modular development. The software architecture created is based on a series of di erent processes and the means of communication among them. All processes were created or adapted for the new architecture and a base set of roles and behaviors was de ned during this work to achieve a base functional framework. In terms of perception, the main focus was to de ne a projection model and camera pose extraction that could provide information in metric coordinates. The second main objective was to adapt the CAMBADA localization algorithm to work on the NAO robots, considering all the limitations it presents when comparing to the MSL team, especially in terms of computational resources. A set of support tools were developed or improved in order to support the test and development in both teams. In general, the work developed during this thesis improved the performance of the teams during play and also the e ectiveness of the developers team when in development and test phases.
Resumo:
This thesis addresses the Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend the state-of-art methods allowing for a faster and more stable learning process, such as required for learning in Robotics. The Q-learning update-rule is widely applied, since it allows to learn without the presence of a model of the environment. However, this update-rule is transition-based and does not take advantage of the underlying episodic structure of collected batch of interactions. The Q-Batch update-rule is proposed in this thesis, to process experiencies along the trajectories collected in the interaction phase. This allows a faster propagation of obtained rewards and penalties, resulting in faster and more robust learning. Non-parametric function approximations are explored, such as Gaussian Processes. This type of approximators allows to encode prior knowledge about the latent function, in the form of kernels, providing a higher level of exibility and accuracy. The application of Gaussian Processes in Batch Reinforcement Learning presented a higher performance in learning tasks than other function approximations used in the literature. Lastly, in order to extract more information from the experiences collected by the agent, model-learning techniques are incorporated to learn the system dynamics. In this way, it is possible to augment the set of collected experiences with experiences generated through planning using the learned models. Experiments were carried out mainly in simulation, with some tests carried out in a physical robotic platform. The obtained results show that the proposed approaches are able to outperform the classical Fitted Q Iteration.