Digital Library

34 results for reinforcement

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain

Hybrid coordination of reinforcement learning-based behaviors for AUV control

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantage of the robustness and modularity of competitive approaches as well as the optimized trajectories of cooperative ones. The paper shows the feasibility of applying this hybrid method to the 3D navigation of an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning, using a continuous Q-learning algorithm implemented with a feed-forward neural network. Realistic simulations were carried out. The results show the good performance of the hybrid method in behavior coordination as well as the convergence of the behaviors.

A behavior-based scheme using reinforcement learning for autonomous underwater vehicles

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a hybrid behavior-based scheme using reinforcement learning for high-level control of autonomous underwater vehicles (AUVs). The two main features of the presented approach are hybrid behavior coordination and semi-online neural-Q_learning (SONQL). Hybrid behavior coordination takes advantage of the robustness and modularity of the competitive approach as well as the efficient trajectories of the cooperative approach. SONQL, a new continuous version of the Q_learning algorithm implemented with a multilayer neural network, is used to learn the behavior state/action mapping online. Experimental results show the feasibility of the presented approach for AUVs.

Policy gradient based Reinforcement Learning for real autonomous underwater cable tracking

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in a cable tracking task. The learning system is characterized by the use of a direct policy search method for learning the internal state/action mapping. Policy-only algorithms may suffer from long convergence times when dealing with real robots. To speed up the process, the learning phase was carried out in a simulated environment and, in a second step, the policy was transferred to and tested successfully on a real robot. Future work will continue the learning process online on the real robot while it performs the task. We demonstrate the feasibility of the approach with real experiments on the underwater robot ICTINEU AUV.

Autonomous underwater vehicle control using reinforcement learning policy search methods

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Autonomous underwater vehicles (AUVs) represent a challenging control problem with complex, noisy dynamics. Today, both the continuous scientific advances in underwater robotics and the increasing number and complexity of subsea missions call for the automation of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of reinforcement learning direct policy search methods (RLDPS) for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target following task.

Towards Direct Policy Search Reinforcement Learning for Robot Control

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach in RL has been to apply value-function-based algorithms, the system detailed here is characterized by the use of direct policy search methods. Rather than approximating a value function, these methods approximate a policy using an independent function approximator with its own parameters, trying to maximize the expected future reward. The policy-based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task.
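
The idea of approximating the policy directly, with its own parameters, can be illustrated with a minimal REINFORCE-style sketch in Python. It is not the algorithm from the paper: the 1-D target-reaching task, the softmax policy, the function names (`run_episode`, `reinforce`) and every hyperparameter are assumptions chosen for illustration.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def run_episode(theta, rng, max_steps=10):
    """Roll out one episode on a hypothetical 1-D line with the target at 0.

    State feature s: 1 if the target lies to the right (pos < 0), else 0.
    Actions: 0 moves one cell left, 1 moves one cell right.
    Each step costs -1, so shorter paths earn a higher return.
    """
    pos = rng.choice([-3, -2, -1, 1, 2, 3])
    trajectory = []
    for _ in range(max_steps):
        if pos == 0:
            break
        s = 1 if pos < 0 else 0
        probs = softmax(theta[s])
        a = 1 if rng.random() < probs[1] else 0
        pos += 1 if a == 1 else -1
        trajectory.append((s, a, probs, -1.0))
    return trajectory

def reinforce(episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Adjust the policy parameters theta along the sampled policy gradient."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # theta[s][a]: preference for action a in s
    for _ in range(episodes):
        g = 0.0
        # Walk the episode backwards to accumulate the return G_t.
        for s, a, probs, r in reversed(run_episode(theta, rng)):
            g = r + gamma * g
            for b in range(2):
                # Gradient of log softmax: indicator(b == a) - pi(b | s).
                grad_log = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += alpha * g * grad_log
    return theta
```

Calling `reinforce()` returns the learned preferences, and `softmax(theta[1])` gives the action probabilities when the target lies to the right. No baseline is subtracted from the return, which keeps the sketch short at the cost of higher variance.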

Management of corn stalk waste as reinforcement for polypropylene injection moulded composites

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The main objective of this study was the management of corn stalk waste as reinforcement for polypropylene (PP) injection moulded composites, as an alternative to wood flour and fibers. In the first step, corn stalk waste was subjected to various treatments, and four different corn stalk derivatives (flour and fibers) suitable for use as reinforcement in composite materials were prepared and characterized. These derivatives are corn stalk flour and thermo-mechanical, semi-chemical, and chemical fibers. They were characterized in terms of their yield, lignin content, Kappa number, fiber length/diameter ratio, fines, coarseness, viscosity, and the breaking length of a standard sheet of paper. Results showed that the corn stalk derivatives have different physico-chemical properties. In the second step, the prepared flour and fibers were explored as a reinforcing element for PP composites. Coupled and non-coupled PP composites were prepared and tested for tensile properties. Overall, the addition of a coupling agent significantly improved the tensile properties of the composites compared with non-coupled samples. In addition, a morphological study revealed the positive effect of the coupling agent on the interfacial bonding. The composites prepared with semi-chemical fiber gave better results than those prepared with the other corn stalk derivatives, owing to the fiber's chemical characteristics.

CB1 cannabinoid receptor modulates MDMA acute responses and reinforcement

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: 3,4-methylenedioxymethamphetamine (MDMA) is a popular recreational drug widely abused by young people. The endocannabinoid system is involved in the addictive processes induced by different drugs of abuse. However, the role of this system in the pharmacological effects of MDMA has not yet been clarified. Methods: Locomotion, body temperature and anxiogenic-like responses were evaluated after acute MDMA administration in CB1 knockout mice. Additionally, the rewarding properties of MDMA were investigated in the place conditioning and intravenous self-administration paradigms. Extracellular levels of dopamine (DA) in the nucleus accumbens were also analyzed by in vivo microdialysis after a single administration of MDMA. Results: Acute MDMA administration increased locomotor activity, body temperature and anxiogenic-like responses in wild-type mice, but these responses were lower or abolished in knockout animals. MDMA produced similar conditioned place preference and increased extracellular dopamine levels in the nucleus accumbens in both genotypes. Nevertheless, CB1 knockout mice failed to self-administer MDMA at any of the doses used. Conclusions: These results indicate that CB1 cannabinoid receptors play an important role in the acute prototypical effects of MDMA and are essential for the acquisition of an operant behavior to self-administer this drug.

Reinforcement learning utilizes proxemics: an avatar learns to manipulate the position of people in immersive virtual reality

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view, head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance of the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of three groups. In the Intimate condition the character could approach within 0.38 m, and in the Social condition no nearer than 1.2 m. In the Random condition the actions of the virtual character were chosen randomly from the same set as in the RL method, and the virtual character could approach within 0.38 m. The experiment continued in each case until the participant reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with 9 out of 10 in the Intimate condition reaching the target significantly faster than the 6 out of 10 who reached the target in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens the door to many such applications where the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.

Reinforcement versus Fluidization in Cytoskeletal Mechanoresponsiveness

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Every adherent eukaryotic cell exerts appreciable traction forces upon its substrate. Moreover, every resident cell within the heart, great vessels, bladder, gut or lung routinely experiences large periodic stretches. As an acute response to such stretches the cytoskeleton can stiffen, increase traction forces and reinforce, as reported by some, or can soften and fluidize, as reported more recently by our laboratory, but in any given circumstance it remains unknown which response might prevail or why. Using a novel nanotechnology, we show here that in loading conditions expected in most physiological circumstances the localized reinforcement response fails to scale up to the level of homogeneous cell stretch; fluidization trumps reinforcement. Whereas the reinforcement response is known to be mediated by upstream mechanosensing and downstream signaling, results presented here show the fluidization response to be altogether novel: it is a direct physical effect of mechanical force acting upon a structural lattice that is soft and fragile. Cytoskeletal softness and fragility, we argue, are consistent with early evolutionary adaptations of the eukaryotic cell to material properties of a soft inert microenvironment.

Research on the suitability of organosolv semi-chemical triticale fibers as reinforcement for recycled HDPE composites

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The main objective of this research was to study the feasibility of incorporating organosolv semi-chemical triticale fibers as the reinforcing element in recycled high density polyethylene (HDPE). In the first step, triticale fibers were characterized in terms of chemical composition and compared with other biomass species (wheat, rye, softwood, and hardwood). Then, organosolv semi-chemical triticale fibers were prepared by the ethanolamine process. These fibers were characterized in terms of their yield, kappa number, fiber length/diameter ratio, fines, and viscosity; the results were compared with those of eucalypt kraft pulp. In the second step, the prepared fibers were examined as a reinforcing element for recycled HDPE composites. Coupled and non-coupled HDPE composites were prepared and tested for tensile properties. Results showed that the addition of the coupling agent maleated polyethylene (MAPE) significantly improved the tensile properties of the composites compared with non-coupled samples and the plain matrix. Furthermore, the influence of MAPE on the interfacial shear strength (IFSS) was studied. The contributions of both fibers and matrix to the composite strength were also studied, using a numerical iterative method based on the Bowyer-Bader and Kelly-Tyson equations.

Atención a la diversidad [Attention to diversity]

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A project carried out at the Colegio Raimundo Lulio in Madrid, whose main objective was to study attention to diversity from the different angles from which the school addresses it. The programs currently in place to meet special educational needs were analyzed.

Factores que influyen en los comportamientos sexuales de prevención frente al virus de inmunodeficiencia humana (VIH) en los adictos a las drogas por vía parenteral (ADVP) [Factors influencing preventive sexual behaviors against the human immunodeficiency virus (HIV) among injecting drug users]

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The existing data on the progressive increase in human immunodeficiency virus (HIV) infection among injecting drug users and their partners and children raise the urgent need to develop preventive programs that are as effective as possible. In this paper we set out three objectives: 1) To highlight some shortcomings observed in the prevention models applied to AIDS. 2) To place special emphasis on the influence on AIDS-preventive behaviors of certain factors that are generally not sufficiently taken into account in current models, namely the magnitude of the reinforcement contingent on a given behavior and the delay with which it is received. 3) To present the results of a study conducted with injecting drug users (Planes, 1991), which aimed to determine the relationships between the magnitude and delay of the reinforcement contingent on preventive sexual behaviors and the frequency of those behaviors.

Semi-online neural-Q_learning for real-time robot learning

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Reinforcement learning (RL) is a very suitable technique for robot learning, as it can learn in unknown environments and in real time. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function, allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes convergence. The term semi-online refers to the fact that the algorithm uses past learning samples as well as the current one. Nevertheless, the algorithm is able to learn in real time while the robot is interacting with the environment. The paper shows simulated results with the "mountain-car" benchmark and real results with an underwater robot in a target following behavior.
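
The two modifications described in this abstract (a function approximator for Q plus a database of past samples that are replayed alongside the current one) can be sketched as follows. This is a simplified illustration, not SONQL itself: it swaps the neural network for a linear approximator over one-hot (state, action) features, keeps the most recent samples rather than the most representative ones, and uses a hypothetical five-cell toy environment. All names and hyperparameters are assumptions.

```python
import random
from collections import deque

N_STATES, GOAL, ACTIONS = 5, 4, (-1, +1)

def step(s, a_idx):
    """Hypothetical toy environment: cells 0..4, reward 1 on reaching the goal."""
    s2 = min(max(s + ACTIONS[a_idx], 0), N_STATES - 1)
    done = s2 == GOAL
    return s2, (1.0 if done else 0.0), done

def q_value(w, s, a_idx):
    # Linear approximator over one-hot features: one weight per (state, action).
    return w[s * len(ACTIONS) + a_idx]

def q_update(w, sample, alpha, gamma):
    """One Q-learning step on a stored sample (s, a, r, s', done)."""
    s, a_idx, r, s2, done = sample
    best_next = max(q_value(w, s2, b) for b in range(len(ACTIONS)))
    target = r if done else r + gamma * best_next
    i = s * len(ACTIONS) + a_idx
    w[i] += alpha * (target - w[i])

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2,
          db_size=50, replays=10, seed=0):
    rng = random.Random(seed)
    w = [0.0] * (N_STATES * len(ACTIONS))
    database = deque(maxlen=db_size)  # bounded store of past learning samples
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap on episode length
            if rng.random() < epsilon:
                a_idx = rng.randrange(len(ACTIONS))
            else:
                qs = [q_value(w, s, b) for b in range(len(ACTIONS))]
                best = max(qs)
                # Break ties at random so early exploration is unbiased.
                a_idx = rng.choice([b for b, q in enumerate(qs) if q == best])
            s2, r, done = step(s, a_idx)
            database.append((s, a_idx, r, s2, done))
            # Learn from the current sample, then replay stored ones.
            q_update(w, database[-1], alpha, gamma)
            k = min(replays, len(database))
            for sample in rng.sample(list(database), k):
                q_update(w, sample, alpha, gamma)
            s = s2
            if done:
                break
    return w
```

After `w = train()`, comparing `q_value(w, s, 1)` with `q_value(w, s, 0)` for each non-goal cell shows which direction the learned Q-function prefers. Replaying stored samples lets a single rewarded transition propagate backwards through the value estimates without the robot having to revisit every state.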

Electronic voting system for computer supported collaborative learning

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The EVS4CSCL project starts in the context of a Computer Supported Collaborative Learning (CSCL) environment. Previous UOC projects created a generic CSCL platform (CLPL) to facilitate the development of CSCL applications. A discussion forum (DF) was the first application developed on top of the framework. This discussion forum differed from other products on the market because of its focus on the learning process. The DF covered the specification and elaboration phases of the discussion learning process but lacked support for the consensus phase. In a learning environment, consensus is not something to be achieved but something to be tested. Such tests are commonly carried out with Electronic Voting System (EVS) tools, but a consensus test is not an assessment test: we are not evaluating our students by their answers but by their discussion activity. Our educational EVS would be used as a discussion catalyst, proposing a discussion about the results after an initial query, or it would be used after a discussion period to show how the discussion changed the students' minds (consensus). It could also be used by the teacher as a quick way to find out where a student needs some reinforcement. That is important in a distance-learning environment, where there is no direct contact between teacher and student and learning gaps are difficult to detect. In an educational environment, assessment is a must, and the EVS will provide direct assessment through peer usefulness evaluation and teacher marks on every query created, and indirect assessment from statistics on user activity.
