899 resultados para learning robot control
Resumo:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, when using RL, has been to apply value function based algorithms, the system here detailed is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task
Resumo:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, when using RL, has been to apply value function based algorithms, the system here detailed is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task
Resumo:
In the future, robots will enter our everyday lives to help us with various tasks.For a complete integration and cooperation with humans, these robots needto be able to acquire new skills. Sensor capabilities for navigation in real humanenvironments and intelligent interaction with humans are some of the keychallenges.Learning by demonstration systems focus on the problem of human robotinteraction, and let the human teach the robot by demonstrating the task usinghis own hands. In this thesis, we present a solution to a subproblem within thelearning by demonstration field, namely human-robot grasp mapping. Robotgrasping of objects in a home or office environment is challenging problem.Programming by demonstration systems, can give important skills for aidingthe robot in the grasping task.The thesis presents two techniques for human-robot grasp mapping, directrobot imitation from human demonstrator and intelligent grasp imitation. Inintelligent grasp mapping, the robot takes the size and shape of the object intoconsideration, while for direct mapping, only the pose of the human hand isavailable.These are evaluated in a simulated environment on several robot platforms.The results show that knowing the object shape and size for a grasping taskimproves the robot precision and performance
Resumo:
The paper deals with a problem of intelligent system’s design for complex environments. There is discussed a possibility to integrate several technologies into one basic structure that could form a kernel of an autonomous intelligent robotic system. One alternative structure is proposed in order to form a basis of an intelligent system that would be able to operate in complex environments. The proposed structure is very flexible because of features that allow adapting via learning and adjustment of the used knowledge. Therefore, the proposed structure may be used in environments with stochastic features such as hardly predictable events or elements. The basic elements of the proposed structure have found their implementation in software system and experimental robotic system. The software system as well as the robotic system has been used for experimentation in order to validate the proposed structure - its functionality, flexibility and reliability. Both of them are presented in the paper. The basic features of each system are presented as well. The most important results of experiments are outlined and discussed at the end of the paper. Some possible directions of further research are also sketched at the end of the paper.
Resumo:
Jerne's idiotypic network theory postulates that the immune response involves inter-antibody stimulation and suppression as well as matching to antigens. The theory has proved the most popular Artificial Immune System (AIS) model for incorporation into behavior-based robotics but guidelines for implementing idiotypic selection are scarce. Furthermore, the direct effects of employing the technique have not been demonstrated in the form of a comparison with non-idiotypic systems. This paper aims to address these issues. A method for integrating an idiotypic AIS network with a Reinforcement Learning based control system (RL) is described and the mechanisms underlying antibody stimulation and suppression are explained in detail. Some hypotheses that account for the network advantage are put forward and tested using three systems with increasing idiotypic complexity. The basic RL, a simplified hybrid AIS-RL that implements idiotypic selection independently of derived concentration levels and a full hybrid AIS-RL scheme are examined. The test bed takes the form of a simulated Pioneer robot that is required to navigate through maze worlds detecting and tracking door markers.
Resumo:
Jerne's idiotypic network theory postulates that the immune response involves inter-antibody stimulation and suppression as well as matching to antigens. The theory has proved the most popular Artificial Immune System (AIS) model for incorporation into behavior-based robotics but guidelines for implementing idiotypic selection are scarce. Furthermore, the direct effects of employing the technique have not been demonstrated in the form of a comparison with non-idiotypic systems. This paper aims to address these issues. A method for integrating an idiotypic AIS network with a Reinforcement Learning based control system (RL) is described and the mechanisms underlying antibody stimulation and suppression are explained in detail. Some hypotheses that account for the network advantage are put forward and tested using three systems with increasing idiotypic complexity. The basic RL, a simplified hybrid AIS-RL that implements idiotypic selection independently of derived concentration levels and a full hybrid AIS-RL scheme are examined. The test bed takes the form of a simulated Pioneer robot that is required to navigate through maze worlds detecting and tracking door markers.
Resumo:
The design of an efficient collaborative multirobot framework that ensures the autonomy and the individualrequirements of the involved robots is a very challenging task. This requires designing an efficient platform for inter-robot communication. P2P is a good approach to achieve this goal. P2P aims at making the communication ubiquitous thereby crossing the communication boundary and has many attractive features to use it as a platform for collaborative multi-robot environments. In this work, we present the JXTA Overlay P2P system and its application for robot control. Since JXTAOverlay is able to overcome Firewalls, Routers and NATs, it is possible to control end-devices in a WAN without changing the network security policy. We used JXTA-Overlay for the control of robot motors. We evaluated the proposed system by many experiments and have shown that the proposed system has a good performance and can be used successfully for the control of robot.
Resumo:
Learning from demonstration becomes increasingly popular as an efficient way of robot programming. Not only a scientific interest acts as an inspiration in this case but also the possibility of producing the machines that would find application in different areas of life: robots helping with daily routine at home, high performance automata in industries or friendly toys for children. One way to teach a robot to fulfill complex tasks is to start with simple training exercises, combining them to form more difficult behavior. The objective of the Master’s thesis work was to study robot programming with visual input. Dynamic movement primitives (DMPs) were chosen as a tool for motion learning and generation. Assuming a movement to be a spring system influenced by an external force, making this system move, DMPs represent the motion as a set of non-linear differential equations. During the experiments the properties of DMP, such as temporal and spacial invariance, were examined. The effect of the DMP parameters, including spring coefficient, damping factor, temporal scaling, on the trajectory generated were studied.
Resumo:
Traditionally simulators have been used extensively in robotics to develop robotic systems without the need to build expensive hardware. However, simulators can be also be used as a “memory”for a robot. This allows the robot to try out actions in simulation before executing them for real. The key obstacle to this approach is an uncertainty of knowledge about the environment. The goal of the Master’s Thesis work was to develop a method, which allows updating the simulation model based on actual measurements to achieve a success of the planned task. OpenRAVE was chosen as an experimental simulation environment on planning,trial and update stages. Steepest Descent algorithm in conjunction with Golden Section search procedure form the principle part of optimization process. During experiments, the properties of the proposed method, such as sensitivity to different parameters, including gradient and error function, were examined. The limitations of the approach were established, based on analyzing the regions of convergence.
Resumo:
There has been recent interest in using temporal difference learning methods to attack problems of prediction and control. While these algorithms have been brought to bear on many problems, they remain poorly understood. It is the purpose of this thesis to further explore these algorithms, presenting a framework for viewing them and raising a number of practical issues and exploring those issues in the context of several case studies. This includes applying the TD(lambda) algorithm to: 1) learning to play tic-tac-toe from the outcome of self-play and of play against a perfectly-playing opponent and 2) learning simple one-dimensional segmentation tasks.
Resumo:
A parallel processor architecture based on a communicating sequential processor chip, the transputer, is described. The architecture is easily linearly extensible to enable separate functions to be included in the controller. To demonstrate the power of the resulting controller some experimental results are presented comparing PID and full inverse dynamics on the first three joints of a Puma 560 robot. Also examined are some of the sample rate issues raised by the asynchronous updating of inertial parameters, and the need for full inverse dynamics at every sample interval is questioned.
Resumo:
Awareness of emerging situations in a dynamic operational environment of a robotic assistive device is an essential capability of such a cognitive system, based on its effective and efficient assessment of the prevailing situation. This allows the system to interact with the environment in a sensible (semi)autonomous / pro-active manner without the need for frequent interventions from a supervisor. In this paper, we report a novel generic Situation Assessment Architecture for robotic systems directly assisting humans as developed in the CORBYS project. This paper presents the overall architecture for situation assessment and its application in proof-of-concept Demonstrators as developed and validated within the CORBYS project. These include a robotic human follower and a mobile gait rehabilitation robotic system. We present an overview of the structure and functionality of the Situation Assessment Architecture for robotic systems with results and observations as collected from initial validation on the two CORBYS Demonstrators.