42 results for Reinforcement Learning, resource-constrained devices, iOS devices, on-device machine learning
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
JXME is the implementation of the JXTA protocols for mobile devices using J2ME. Two different flavors of JXME have been implemented, each specific to a particular set of devices according to their capabilities. The main value of JXME is the simplicity with which it allows peer-to-peer (P2P) applications to be created on limited devices. In addition to assessing JXME's functionality, it is also important to understand the default security level it provides. This paper presents a brief analysis of the current state of security in JXME, focusing on the JXME-Proxied version, identifies existing vulnerabilities, and proposes further improvements in this field.
Abstract:
This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method combines the robustness and modularity of competitive approaches with the optimized trajectories of cooperative ones. This paper shows the feasibility of applying the hybrid method to the 3D navigation of an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning; a continuous Q-learning algorithm implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results obtained show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors.
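As an illustration of the coordination scheme described above, the sketch below blends behavior proposals: a strongly activated behavior takes full control (competitive stage), otherwise proposals are averaged by activation (cooperative stage). The dominance threshold, priority ordering and action encoding are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

DOMINANCE = 0.9  # assumed activation above which a behavior takes full control

def coordinate(responses):
    """responses: list of (priority, activation in [0, 1], action vector)."""
    # Competitive stage: the highest-priority strongly activated behavior
    # wins outright (lower number = higher priority).
    for priority, activation, action in sorted(responses, key=lambda r: r[0]):
        if activation >= DOMINANCE:
            return action
    # Cooperative stage: activation-weighted average of all proposals.
    weights = np.array([act for _, act, _ in responses])
    actions = np.stack([a for _, _, a in responses])
    return (weights[:, None] * actions).sum(axis=0) / max(weights.sum(), 1e-9)

# Example: a strongly activated obstacle-avoidance proposal dominates a
# weaker goal-seeking one; with both below threshold they would be blended.
avoid = (0, 0.95, np.array([0.0, -1.0, 0.0]))
seek = (1, 0.60, np.array([1.0, 0.0, 0.0]))
print(coordinate([avoid, seek]))
```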
Abstract:
This paper presents a hybrid behavior-based scheme using reinforcement learning for the high-level control of autonomous underwater vehicles (AUVs). The two main features of the presented approach are hybrid behavior coordination and semi-online neural-Q_learning (SONQL). Hybrid behavior coordination takes advantage of the robustness and modularity of the competitive approach as well as the efficient trajectories of the cooperative approach. SONQL, a new continuous variant of the Q_learning algorithm based on a multilayer neural network, is used to learn the behavior state/action mapping online. Experimental results show the feasibility of the presented approach for AUVs.
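A minimal sketch of the kind of neural Q_learning update that SONQL builds on: a small feed-forward network maps a continuous state to Q-values over a discretized action set, and its weights follow the gradient of the squared temporal-difference error. The network sizes, step size, discount and action discretization are assumptions for illustration, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS, HIDDEN = 4, 5, 32   # illustrative sizes
ALPHA, GAMMA = 1e-3, 0.99                 # assumed step size and discount

W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(s):
    """Forward pass: hidden activations and Q-values for state s."""
    h = np.tanh(s @ W1)
    return h, h @ W2

def td_update(s, a, r, s_next, done):
    """One gradient step on the squared temporal-difference error."""
    global W1, W2
    h, q = q_values(s)
    _, q_next = q_values(s_next)
    target = r + (0.0 if done else GAMMA * q_next.max())
    td_error = target - q[a]
    grad_q = np.zeros(N_ACTIONS)
    grad_q[a] = -td_error                  # dL/dq for L = 0.5 * td_error**2
    grad_h = (W2 @ grad_q) * (1.0 - h**2)  # backprop through the tanh layer
    W2 -= ALPHA * np.outer(h, grad_q)
    W1 -= ALPHA * np.outer(s, grad_h)
    return td_error
```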
Abstract:
This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in a cable tracking task. The learning system is characterized by the use of a direct policy search method for learning the internal state/action mapping. Policy-only algorithms may suffer from long convergence times when dealing with real robots. In order to speed up the process, the learning phase was carried out in a simulated environment and, in a second step, the policy was transferred to and tested successfully on a real robot. Future work will continue the learning process online on the real robot while it performs the task. We demonstrate the approach's feasibility with real experiments on the underwater robot ICTINEU AUV.
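The two-step workflow described above (learn in simulation, then transfer the policy to the robot) can be sketched as follows. Everything here is an illustrative stand-in: the toy one-dimensional "cable offset" simulator and the simple random-search policy improvement are not the paper's environment or learning method.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, steps=50):
    """Toy simulator: keep the cable offset x near 0; returns total reward."""
    x, total = 1.0, 0.0
    for _ in range(steps):
        action = theta[0] * x + theta[1]        # linear policy
        x += -0.5 * action + rng.normal(0, 0.02)
        total += -abs(x)                        # reward: stay centred on the cable
    return total

def policy_search(theta, iters=300, sigma=0.1):
    """Hill-climb the policy parameters against the simulator."""
    best = simulate(theta)
    for _ in range(iters):
        candidate = theta + rng.normal(0, sigma, size=theta.shape)
        score = simulate(candidate)
        if score > best:
            theta, best = candidate, score
    return theta

theta = policy_search(np.zeros(2))  # step 1: learn in simulation
# step 2: transfer -> use theta to initialize the controller on the real
# robot, optionally continuing the same search online during operation.
```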
Abstract:
A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view, head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance of the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of three groups. In the Intimate condition the character could approach to within 0.38 m, and in the Social condition no nearer than 1.2 m. In the Random condition the actions of the virtual character were chosen randomly from among the same set as in the RL method, and the character could approach to within 0.38 m. The experiment continued in each case until the participant either reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with the 9 out of 10 participants in the Intimate condition who reached the target doing so significantly faster than the 6 out of 10 who reached it in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens the door to many such applications in which the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.
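To make the setup concrete, a minimal tabular Q-learning loop of the kind that could drive such a character is sketched below: states discretize the participant's position in the alley, actions move the character, and reward arrives when the participant reaches the target. The state/action encoding and all parameters are assumptions for illustration, not the study's actual formulation.

```python
import random
from collections import defaultdict

ACTIONS = ["approach", "retreat", "wait"]   # assumed character actions
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1      # assumed learning parameters
Q = defaultdict(float)                      # (state, action) -> value

def choose(state):
    """Epsilon-greedy action selection over the learned Q-values."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard one-step Q-learning backup."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```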
Abstract:
Autonomous underwater vehicles (AUVs) represent a challenging control problem with complex, noisy dynamics. Nowadays, not only continuous scientific advances in underwater robotics but also the increasing number and complexity of subsea missions call for the automation of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of reinforcement learning direct policy search methods (RLDPS) for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target following task.
Abstract:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach when using RL has been to apply value-function-based algorithms, the system detailed here is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy-based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task.
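A hedged sketch of a direct policy search update of the general kind described here: the policy is a Gaussian whose mean is linear in state features, and its parameters follow the gradient of expected return (a REINFORCE-style update). The feature dimension, exploration noise and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = np.zeros(4)            # linear policy weights over 4 state features
SIGMA, LR = 0.2, 1e-2          # assumed exploration noise and step size

def act(phi):
    """Sample an action from the Gaussian policy N(theta . phi, SIGMA^2)."""
    return rng.normal(theta @ phi, SIGMA)

def reinforce_update(episode, baseline=0.0):
    """episode: list of (features, action, reward); one gradient step on theta."""
    global theta
    rewards = [r for _, _, r in episode]
    returns = np.cumsum(rewards[::-1])[::-1]   # reward-to-go at each step
    for (phi, a, _), G in zip(episode, returns):
        # grad log pi = (a - theta.phi) / SIGMA^2 * phi for a Gaussian mean policy
        theta += LR * (G - baseline) * (a - theta @ phi) / SIGMA**2 * phi
```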
Abstract:
This paper reports the microstructural analysis of S-rich CuIn(S,Se)2 layers produced by electrodeposition of CuInSe2 precursors and annealing under sulfurizing conditions, as a function of the sulfurization temperature. Characterization of the layers by Raman scattering, scanning electron microscopy, Auger electron spectroscopy, and XRD techniques has revealed a strong dependence of the crystalline quality of these layers on the sulfurization temperature: higher sulfurization temperatures lead to films with improved crystallinity, larger average grain size, and a lower density of structural defects. However, higher temperatures also favor the formation of a thicker MoS2 interphase layer between the CuInS2 absorber layer and the Mo back contact. Decreasing the sulfurization temperature leads to a significant decrease in the thickness of this intermediate layer and is also accompanied by significant changes in the composition of the interface region between the absorber and the MoS2 layer, which becomes Cu rich. The characterization of devices fabricated with these absorbers corroborates the significant impact of all these features on device parameters such as the open-circuit voltage and fill factor that determine the efficiency of the solar cells.
Abstract:
As a result of climate change, streams are warming and their runoff has been decreasing in most temperate areas. These changes can affect consumers directly by increasing their metabolic rates and modifying their physiology and indirectly by changing the quality of the resources on which organisms depend. In this study, a common stream detritivore (Echinogammarus berilloni Catta) was reared at two temperatures (15 and 20°C) and fed Populus nigra L. leaves that had been conditioned either in an intermittent or permanent reach to evaluate the effects of resource quality and increased temperatures on detritivore performance, stoichiometry and nutrient cycling. The lower quality (i.e., lower protein, soluble carbohydrates and higher C:P and N:P ratios) of leaves conditioned in pools resulted in compensatory feeding and lower nutrient retention capacity by E. berilloni. This effect was especially marked for phosphorus, which was unexpected based on predictions of ecological stoichiometry. When individuals were fed pool-conditioned leaves at warmer temperatures, their growth rates were higher, but consumers exhibited less efficient assimilation and higher mortality. Furthermore, the shifts to lower C:P ratios and higher lipid concentrations in shredder body tissues suggest that structural molecules such as phospholipids are preserved over other energetic C-rich macromolecules such as carbohydrates. These effects on consumer physiology and metabolism were further translated into feces and excreta nutrient ratios. Overall, our results show that the effects of reduced leaf quality on detritivore nutrient retention were more severe at higher temperatures because the shredders were not able to offset their increased metabolism with increased consumption or more efficient digestion when fed pool-conditioned leaves. Consequently, the synergistic effects of impaired food quality and increased temperatures might not only affect the physiology and survival of detritivores but also extend to other trophic compartments through detritivore-mediated nutrient cycling.
Abstract:
This research addresses the problem of intimate partner gender-based violence against immigrant women. The objective is to understand the impact of socio-legal mechanisms on immigrant women themselves, as well as how the phenomenon of immigration has influenced institutional practices. The results indicate that the socio-legal mechanisms caring for immigrant women have protective effects but also perverse ones. Proposals are made for a more situated and reflexive intervention.
Abstract:
This report describes the development of a web system to manage the inventory of network devices and computer equipment at a school. More specifically, for the computers it can manage information on hardware, software and network characteristics such as the MAC address, IP address, the wall socket a machine is connected to, etc. For the network devices, it can manage information on the cabinets, patch panels, switches, virtual LANs and access points to which computers, laptops, telephones or fax machines are connected.
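For illustration only, the kinds of inventory entities the abstract describes could be modelled as below; every class and field name is a guess at a plausible schema, not the system's actual design.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NetworkInfo:
    mac: str                   # MAC address
    ip: str                    # IP address
    wall_socket: str           # wall socket ("roseta") the device is plugged into

@dataclass
class Computer:
    hostname: str
    hardware: dict = field(default_factory=dict)   # CPU, RAM, disk, ...
    software: list = field(default_factory=list)   # installed applications
    network: Optional[NetworkInfo] = None

@dataclass
class Switch:
    cabinet: str               # rack cabinet holding the switch
    patch_panel: str
    vlans: list = field(default_factory=list)
    ports: dict = field(default_factory=dict)      # port -> connected device
```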
Abstract:
Reinforcement learning (RL) is a very suitable technique for robot learning, as it allows learning in unknown environments with real-time computation. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function, allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes the convergence. The term semi-online refers to the fact that the algorithm uses not only the current learning sample but also past ones. Nevertheless, the algorithm is able to learn in real time while the robot is interacting with the environment. The paper shows simulated results with the "mountain-car" benchmark and also real results with an underwater robot in a target following behavior.
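The database of representative samples can be sketched as follows: keep a bounded set of sufficiently novel (state, action, reward, next state) samples and replay them together with the newest one at each step, which is what makes the learning semi-online. The novelty test, capacity and interfaces are illustrative assumptions; update_q stands in for the neural Q_function update.

```python
import random

CAPACITY = 500                 # assumed maximum database size

class SampleDatabase:
    def __init__(self, min_dist=0.05):
        self.samples, self.min_dist = [], min_dist

    def add(self, sample, distance):
        # Keep a sample only if it is sufficiently novel w.r.t. stored ones.
        if all(distance(sample, s) > self.min_dist for s in self.samples):
            self.samples.append(sample)
            if len(self.samples) > CAPACITY:
                self.samples.pop(0)          # drop the oldest representative

    def replay(self, batch_size=32):
        k = min(batch_size, len(self.samples))
        return random.sample(self.samples, k)

def semi_online_step(db, new_sample, distance, update_q):
    """Update on the current sample plus a batch of past representatives."""
    db.add(new_sample, distance)
    for s in [new_sample] + db.replay():
        update_q(s)
```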
Abstract:
Guifi.net is a free telecommunications network built at the initiative of its own participants who, through a peer agreement, interconnect to share services and resources. From its philosophical link with free software it follows that all information is public; from its technological agnosticism, that any device on the market can be used. This is the reason for the 'unsolclic', a sequence of generic device configuration commands and a true success factor. The objective of this project is to improve the process of incorporating new devices into the current application. Through a new web management interface and a system of standard templates, advanced users will be able to create the 'unsolclic' configurators for new devices on the market and maintain the existing ones more easily and efficiently.
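The 'unsolclic' idea, a generic configuration template instantiated with node-specific values, can be sketched with a plain string template. The command syntax and variable names below are invented for illustration and do not come from guifi.net.

```python
from string import Template

# Hypothetical generic configurator for one device model; real 'unsolclic'
# templates target concrete firmware command syntaxes.
UNSOLCLIC_TEMPLATE = Template("""\
set system host-name $hostname
set interface wireless ssid $ssid
set interface wireless channel $channel
set interface ip address $ip/$mask
""")

def render(node):
    """Fill the device template with one node's parameters."""
    return UNSOLCLIC_TEMPLATE.substitute(node)

print(render({"hostname": "node-42", "ssid": "guifi.net",
              "channel": "6", "ip": "10.1.2.3", "mask": "27"}))
```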
Abstract:
Utilizing the well-known Ultimatum Game, this note presents the following phenomenon. If we start with simple stimulus-response agents, learning through naive reinforcement, and then grant them some introspective capabilities, we get outcomes that are not closer but farther away from the fully introspective game-theoretic approach. The cause of this is the following: there is an asymmetry in the information that agents can deduce from their experience, and this leads to a bias in their learning process.
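A minimal sketch of the naive reinforcement baseline the note starts from, in the Roth-Erev style: proposers and responders keep propensities over discrete strategies and reinforce whichever strategy just earned a payoff. The pie size, strategy grids and initial propensities are illustrative assumptions.

```python
import random

PIE = 10
OFFERS = list(range(PIE + 1))        # proposer strategies: amount offered
THRESHOLDS = list(range(PIE + 1))    # responder strategies: minimum acceptable

def pick(propensity):
    """Choose a strategy with probability proportional to its propensity."""
    total = sum(propensity.values())
    r = random.uniform(0, total)
    for strategy, weight in propensity.items():
        r -= weight
        if r <= 0:
            return strategy
    return strategy

def play_round(prop_p, prop_r):
    offer = pick(prop_p)
    threshold = pick(prop_r)
    accepted = offer >= threshold
    prop_p[offer] += (PIE - offer) if accepted else 0   # reinforce by payoff
    prop_r[threshold] += offer if accepted else 0

prop_p = {o: 1.0 for o in OFFERS}
prop_r = {t: 1.0 for t in THRESHOLDS}
for _ in range(5000):
    play_round(prop_p, prop_r)
```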