59 results for tunnel reinforcement
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantage of the robustness and modularity of competitive approaches as well as the optimized trajectories of cooperative ones. This paper shows the feasibility of applying this hybrid method to the 3D navigation of an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning. A continuous Q-learning algorithm implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors.
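As a rough illustration of Q-learning over a continuous state with a feed-forward network, the sketch below performs one semi-gradient update per observed transition. It is not the authors' implementation: the class name `TinyQNet`, the single tanh hidden layer, the discrete action set, and the values of the learning rate and discount factor are all assumptions made for the example.

```python
import numpy as np

class TinyQNet:
    """One-hidden-layer network mapping a continuous state to one Q-value per action.

    Hypothetical sketch: sizes, learning rate and discount factor are assumed values.
    """

    def __init__(self, n_state, n_hidden, n_actions, lr=0.05, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_hidden, n_state))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.lr, self.gamma = lr, gamma

    def q_values(self, s):
        h = np.tanh(self.W1 @ s + self.b1)
        return self.W2 @ h + self.b2

    def update(self, s, a, r, s_next, done):
        """One semi-gradient Q-learning step; returns the TD error before the update."""
        # Bootstrapped target: r + gamma * max_a' Q(s', a'), or just r at episode end
        target = r if done else r + self.gamma * np.max(self.q_values(s_next))
        h = np.tanh(self.W1 @ s + self.b1)
        q_sa = self.W2[a] @ h + self.b2[a]
        td = target - q_sa
        # Backpropagate Q(s, a) through the tanh layer (computed before W2 changes)
        grad_h = self.W2[a] * (1.0 - h ** 2)
        self.W2[a] += self.lr * td * h
        self.b2[a] += self.lr * td
        self.W1 += self.lr * td * np.outer(grad_h, s)
        self.b1 += self.lr * td * grad_h
        return td
```

Calling `update` repeatedly on the same terminal transition should shrink the TD error toward zero, which is a quick sanity check that the gradient step has the right sign.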
Abstract:
This paper presents a hybrid behavior-based scheme using reinforcement learning for high-level control of autonomous underwater vehicles (AUVs). The two main features of the presented approach are hybrid behavior coordination and semi-online neural-Q-learning (SONQL). Hybrid behavior coordination takes advantage of the robustness and modularity of the competitive approach as well as the efficient trajectories of the cooperative approach. SONQL, a new continuous version of the Q-learning algorithm based on a multilayer neural network, is used to learn the behavior state/action mapping online. Experimental results show the feasibility of the presented approach for AUVs.
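One way to picture a coordinator that sits between competitive and cooperative selection: each behavior reports a control vector and an activation level in [0, 1], and a higher-priority behavior suppresses a lower-priority one in proportion to its own activation. The blending rule below is an illustrative assumption and not necessarily the formula used in the paper; `hybrid_node` and its arguments are hypothetical names.

```python
import numpy as np

def hybrid_node(v_dom, a_dom, v_sub, a_sub):
    """Blend a dominant and a subordinate behavior response (assumed rule).

    v_*: control vectors; a_*: activation levels in [0, 1].
    a_dom = 1 -> pure competition: the dominant behavior wins outright.
    a_dom = 0 -> only the subordinate behavior acts.
    In between -> cooperative weighted mix of both responses.
    """
    v_dom = np.asarray(v_dom, dtype=float)
    v_sub = np.asarray(v_sub, dtype=float)
    w_sub = a_sub * (1.0 - a_dom)   # subordinate is attenuated by the dominant activation
    a_out = a_dom + w_sub           # combined activation, still within [0, 1]
    if a_out == 0.0:
        return np.zeros_like(v_dom), 0.0
    v_out = (a_dom * v_dom + w_sub * v_sub) / a_out
    return v_out, a_out
```

Because the node returns a vector and an activation of the same kind it consumes, several such nodes could be chained by priority, with the output of one acting as the subordinate input of the next.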
Abstract:
This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in a cable-tracking task. The learning system is characterized by the use of a direct policy search method for learning the internal state/action mapping. Policy-only algorithms may suffer from long convergence times when dealing with real robots. In order to speed up the process, the learning phase was carried out in a simulated environment and, in a second step, the policy was transferred to and tested successfully on a real robot. Future work will continue the learning process online on the real robot while it performs the task. We demonstrate the feasibility of the approach with real experiments on the underwater robot ICTINEU AUV.
Abstract:
Autonomous underwater vehicles (AUVs) represent a challenging control problem with complex, noisy dynamics. Nowadays, not only the continuous scientific advances in underwater robotics but also the increasing number and complexity of subsea missions call for the automation of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of reinforcement learning direct policy search (RLDPS) methods for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target-following task.
Abstract:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach in RL has been to apply value-function-based algorithms, the system detailed here is characterized by the use of direct policy search methods. Rather than approximating a value function, these methods approximate a policy using an independent function approximator with its own parameters, trying to maximize the expected future reward. The policy-based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target-reaching task.
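The idea of adjusting policy parameters directly toward higher expected reward can be sketched with REINFORCE, a well-known direct policy search method. The abstract does not say which algorithm the authors use, so REINFORCE is swapped in here purely for illustration; the softmax policy over discrete actions, the function names, and the default step size and discount factor are assumptions.

```python
import numpy as np

def action_probs(theta, features):
    """Softmax policy; theta has shape (n_actions, n_features)."""
    logits = theta @ features
    logits = logits - logits.max()   # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def grad_log_pi(theta, features, action):
    """Gradient of log pi(action | features) with respect to theta."""
    p = action_probs(theta, features)
    one_hot = np.zeros_like(p)
    one_hot[action] = 1.0
    return np.outer(one_hot - p, features)

def reinforce_update(theta, episode, lr=0.1, gamma=0.95):
    """One REINFORCE step from an episode of (features, action, reward) tuples."""
    g = 0.0
    grad = np.zeros_like(theta)
    for features, action, reward in reversed(episode):
        g = reward + gamma * g                           # discounted return-to-go
        grad += g * grad_log_pi(theta, features, action)
    return theta + lr * grad                             # gradient ascent on expected return
```

No value function appears anywhere: an action followed by a positive return becomes more probable in that state, and one followed by a negative return becomes less probable.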
Abstract:
The main objective of this study was the management of corn stalk waste as reinforcement for polypropylene (PP) injection-moulded composites, as an alternative to wood flour and fibers. In the first step, corn stalk waste was subjected to various treatments, and four different corn stalk derivatives (flour and fibers) suitable for use as reinforcement in composite materials were prepared and characterized. These derivatives are corn stalk flour and thermo-mechanical, semi-chemical, and chemical fibers. They were characterized in terms of their yield, lignin content, Kappa number, fiber length/diameter ratio, fines, coarseness, viscosity, and the length at break of a standard sheet of paper. The results showed that the corn stalk derivatives have different physico-chemical properties. In the second step, the prepared flour and fibers were explored as a reinforcing element for PP composites. Coupled and non-coupled PP composites were prepared and tested for tensile properties. Overall, the addition of a coupling agent significantly improved the tensile properties of the composites compared with the non-coupled samples. In addition, a morphological study revealed the positive effect of the coupling agent on the interfacial bonding. The composites prepared with semi-chemical fiber gave better results than those prepared with the other corn stalk derivatives, owing to the chemical characteristics of this fiber.
Abstract:
Background: 3,4-methylenedioxymethamphetamine (MDMA) is a popular recreational drug widely abused by young people. The endocannabinoid system is involved in the addictive processes induced by different drugs of abuse. However, the role of this system in the pharmacological effects of MDMA has not yet been clarified. Methods: Locomotion, body temperature and anxiogenic-like responses were evaluated after acute MDMA administration in CB1 knockout mice. Additionally, MDMA rewarding properties were investigated in the place conditioning and the intravenous self-administration paradigms. Extracellular levels of dopamine (DA) in the nucleus accumbens were also analyzed after a single administration of MDMA by in vivo microdialysis. Results: Acute MDMA administration increased locomotor activity, body temperature and anxiogenic-like responses in wild-type mice, but these responses were lower or abolished in knockout animals. MDMA produced similar conditioned place preference and increased dopamine extracellular levels in the nucleus accumbens in both genotypes. Nevertheless, CB1 knockout mice failed to self-administer MDMA at any of the doses used. Conclusions: These results indicate that CB1 cannabinoid receptors play an important role in the acute prototypical effects of MDMA, and are essential in the acquisition of an operant behavior to self-administer this drug.
Abstract:
La2/3Ca1/3MnO3 (LCMO) films have been deposited on (110)-oriented SrTiO3 (STO) substrates. X-ray diffraction and high-resolution electron microscopy reveal that the (110) LCMO films are epitaxial and anisotropically in-plane strained, with higher relaxation along the [1-10] direction than along the [001] direction; x-ray absorption spectroscopy data signaled the existence of a single intermediate Mn3+/4+ 3d-state at the film surface. Their magnetic properties are compared to those of (001) LCMO films grown simultaneously on (001) STO substrates. It is found that (110) LCMO films present a higher Curie temperature (TC) and a weaker decay of magnetization when approaching TC than their (001) LCMO counterparts. These improved films have been subsequently covered by nanometric STO layers. Conducting atomic-force experiments have shown that STO layers as thin as 0.8 nm, grown on top of the (110) LCMO electrode, display good insulating properties. We will show that the electric conductance across (110) STO layers, which depends exponentially on the barrier thickness, is tunnel-like. The barrier height in STO (110) is found to be similar to that of STO (001). These results show that (110) LCMO electrodes can be better electrodes than (001) LCMO for magnetic tunnel junctions, and that (110) STO layers are suitable insulating barriers.
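The statement that an exponential dependence on barrier thickness indicates tunnel-like conduction can be made concrete with the textbook rectangular-barrier picture, G(t) proportional to exp(-2 kappa t) with kappa = sqrt(2 m phi) / hbar. The sketch below is only that simple model, not the analysis used in the paper: the free-electron mass, the function names, and any barrier height passed in are assumptions, since the abstract gives no numerical barrier height.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free-electron mass, kg (assumption: no effective-mass correction)
EV = 1.602176634e-19     # joules per electronvolt

def decay_constant_per_nm(barrier_height_ev, mass_ratio=1.0):
    """kappa = sqrt(2 m phi) / hbar for a rectangular barrier, in nm^-1."""
    kappa_per_m = math.sqrt(2.0 * mass_ratio * M_E * barrier_height_ev * EV) / HBAR
    return kappa_per_m * 1e-9

def relative_conductance(thickness_nm, barrier_height_ev, mass_ratio=1.0):
    """G(t) / G(0) = exp(-2 kappa t) in the rectangular-barrier picture."""
    kappa = decay_constant_per_nm(barrier_height_ev, mass_ratio)
    return math.exp(-2.0 * kappa * thickness_nm)

def fit_decay_constant(thicknesses_nm, conductances):
    """Least-squares slope of ln G versus t; returns kappa in nm^-1."""
    n = len(thicknesses_nm)
    logs = [math.log(g) for g in conductances]
    t_mean = sum(thicknesses_nm) / n
    l_mean = sum(logs) / n
    num = sum((t - t_mean) * (l - l_mean) for t, l in zip(thicknesses_nm, logs))
    den = sum((t - t_mean) ** 2 for t in thicknesses_nm)
    return -0.5 * num / den
```

In this picture, a straight line in a plot of ln G against thickness is the signature of tunneling, and its slope gives kappa and hence an estimate of the barrier height.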
Abstract:
We report on experiments of spin filtering through ultrathin single-crystal layers of the insulating and ferromagnetic oxide BiMnO3 (BMO). The spin polarization of the electrons tunneling from a gold electrode through BMO is analyzed with a counterelectrode of the half-metallic oxide La2/3Sr1/3MnO3 (LSMO). At 3 K we find a 50% change of the tunnel resistances according to whether the magnetizations of BMO and LSMO are parallel or opposite. This effect corresponds to a spin-filtering efficiency of up to 22%. Our results thus show the potential of complex ferromagnetic insulating oxides for spin filtering and injection.
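The step from a 50% resistance change to a spin-filtering efficiency of about 22% can be reproduced with a Jullière-type relation, TMR = 2 P1 P2 / (1 - P1 P2), if the LSMO counterelectrode is assumed to have a spin polarization of 0.9. Neither the relation nor the value 0.9 is stated in the abstract; both are assumptions for this sketch, and the function names are hypothetical.

```python
def tmr_from_polarizations(p1, p2):
    """Julliere-type relation: TMR = 2 P1 P2 / (1 - P1 P2)."""
    return 2.0 * p1 * p2 / (1.0 - p1 * p2)

def polarization_from_tmr(tmr, p_counter):
    """Invert the relation above for P1, given the counterelectrode polarization."""
    product = tmr / (2.0 + tmr)   # P1 * P2
    return product / p_counter

# Assumed LSMO polarization of 0.9 (not given in the abstract)
efficiency = polarization_from_tmr(0.5, 0.9)   # about 0.222
```

With these assumptions, a TMR of 0.5 gives P1 P2 = 0.2 and therefore an efficiency of roughly 0.22, in line with the figure quoted in the abstract.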
Abstract:
We report on the growth and characterization of SrRuO3 single layers and SrRuO3/SrTiO3/SrRuO3 heterostructures grown on SrTiO3(100) substrates. The thickness dependence of the coercivity was determined for these single layers. Heterostructures with barrier thickness tb = 1, 2.5, and 4 nm were fabricated, with electrodes having thickness ranging from 10 to 100 nm. The hysteresis loops of heterostructures with tb = 2.5 nm and 4 nm reveal uncoupled magnetic switching of the electrodes. Therefore, these heterostructures can be used for the fabrication of magnetic tunneling junctions.
Abstract:
A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance to the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of the following three groups: In the Intimate condition the character could approach within 0.38m and in the Social condition no nearer than 1.2m. In the Random condition the actions of the virtual character were chosen randomly from among the same set as in the RL method, and the virtual character could approach within 0.38m. The experiment continued in each case until the participant either reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with 9 out of 10 in the Intimate condition reaching the target significantly faster than the 6 out of 10 who reached the target in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens up the door for many such applications where the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.