905 resultados para Reinforcement Learning,resource-constrained devices,iOS devices,on-device machine learning


Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a new paradigm for collective learning in multi-agent systems (MAS) as a solution to the problem in which several agents acting over the same environment must learn how to perform tasks, simultaneously, based on feedbacks given by each one of the other agents. We introduce the proposed paradigm in the form of a reinforcement learning algorithm, nominating it as reinforcement learning with influence values. While learning by rewards, each agent evaluates the relation between the current state and/or action executed at this state (actual believe) together with the reward obtained after all agents that are interacting perform their actions. The reward is a result of the interference of others. The agent considers the opinions of all its colleagues in order to attempt to change the values of its states and/or actions. The idea is that the system, as a whole, must reach an equilibrium, where all agents get satisfied with the obtained results. This means that the values of the state/actions pairs match the reward obtained by each agent. This dynamical way of setting the values for states and/or actions makes this new reinforcement learning paradigm the first to include, naturally, the fact that the presence of other agents in the environment turns it a dynamical model. As a direct result, we implicitly include the internal state, the actions and the rewards obtained by all the other agents in the internal state of each agent. This makes our proposal the first complete solution to the conceptual problem that rises when applying reinforcement learning in multi-agent systems, which is caused by the difference existent between the environment and agent models. With basis on the proposed model, we create the IVQ-learning algorithm that is exhaustive tested in repetitive games with two, three and four agents and in stochastic games that need cooperation and in games that need collaboration. This algorithm shows to be a good option for obtaining solutions that guarantee convergence to the Nash optimum equilibrium in cooperative problems. Experiments performed clear shows that the proposed paradigm is theoretical and experimentally superior to the traditional approaches. Yet, with the creation of this new paradigm the set of reinforcement learning applications in MAS grows up. That is, besides the possibility of applying the algorithm in traditional learning problems in MAS, as for example coordination of tasks in multi-robot systems, it is possible to apply reinforcement learning in problems that are essentially collaborative

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work describes the study, the analysis, the project methodology and the constructive details of a high frequency DC/AC resonant series converter using sequential commutation techniques for the excitation of an inductive coupled thermal plasma torch. The aim of this thesis is to show the new modulation technique potentialities and to present a technological option for the high-frequency electronic power converters development. The resonant converter operates at 50 kW output power under a 400 kHz frequency and it is constituted by inverter cells using ultra-fast IGBT devices. In order to minimize the turn-off losses, the inverter cells operates in a ZVS mode referred by a modified PLL loop that maintains this condition stable, despite the load variations. The sequential pulse gating command strategy used it allows to operate the IGBT devices on its maximum power limits using the derating and destressing current scheme, as well as it propitiates a frequency multiplication of the inverters set. The output converter is connected to a series resonant circuit constituted by the applicator ICTP torch, a compensation capacitor and an impedance matching RF transformer. At the final, are presented the experimental results and the many tests achieved in laboratory as form to validate the proposed new technique

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The metaheuristics techiniques are known to solve optimization problems classified as NP-complete and are successful in obtaining good quality solutions. They use non-deterministic approaches to generate solutions that are close to the optimal, without the guarantee of finding the global optimum. Motivated by the difficulties in the resolution of these problems, this work proposes the development of parallel hybrid methods using the reinforcement learning, the metaheuristics GRASP and Genetic Algorithms. With the use of these techniques, we aim to contribute to improved efficiency in obtaining efficient solutions. In this case, instead of using the Q-learning algorithm by reinforcement learning, just as a technique for generating the initial solutions of metaheuristics, we use it in a cooperative and competitive approach with the Genetic Algorithm and GRASP, in an parallel implementation. In this context, was possible to verify that the implementations in this study showed satisfactory results, in both strategies, that is, in cooperation and competition between them and the cooperation and competition between groups. In some instances were found the global optimum, in others theses implementations reach close to it. In this sense was an analyze of the performance for this proposed approach was done and it shows a good performance on the requeriments that prove the efficiency and speedup (gain in speed with the parallel processing) of the implementations performed

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The semiconductor technologies evolutions leads devices to be developed with higher processing capability. Thus, those components have been used widely in more fields. Many industrial environment such as: oils, mines, automotives and hospitals are frequently using those devices on theirs process. Those industries activities are direct related to environment and health safe. So, it is quite important that those systems have extra safe features yield more reliability, safe and availability. The reference model eOSI that will be presented by this work is aimed to allow the development of systems under a new view perspective which can improve and make simpler the choice of strategies for fault tolerant. As a way to validate the model na architecture FPGA-based was developed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This Dissertation aims to provide a communication mechanism between Digital TV viewers and interaction devices, such as robots, for example, placed on the environment from which a TV program is being live broadcasted. Such communication mechanism has the objective to allow viewers controll the Interaction Devices through their TV devices, using the broadcast channel present in Interactive Digital TV systems, and receive data from the devices by the broadcast channel. This system was projected as a middlewaer system where the Interaction Devices in the TV program set are interconnected, creating a Interactive Device Network. With this approach, the system is capable of manage the devices on the network, controlling the flow of coming and leaving elements, in a transparent way for the viewers. The system yet allows the Interaction Devices communicate each other, with a integrated communication channel with no worries about the physical communication layer

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Relevant researches have been growing on electric machine without mancal or bearing and that is generally named bearingless motor or specifically, mancal motor. In this paper it is made an introductory presentation about bearingless motor and its peripherical devices with focus on the design and implementation of sensors and interfaces needed to control rotor radial positioning and rotation of the machine. The signals from the machine are conditioned in analogic inputs of DSP TMS320F2812 and used in the control program. This work has a purpose to elaborate and build a system with sensors and interfaces suitable to the input and output of DSP TMS320F2812 to control a mancal motor, bearing in mind the modularity, simplicity of circuits, low number of power used, good noise imunity and good response frequency over 10 kHz. The system is tested at a modified ordinary induction motor of 3,7 kVA to be used with a bearingless motor with divided coil

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In multi-robot systems, both control architecture and work strategy represent a challenge for researchers. It is important to have a robust architecture that can be easily adapted to requirement changes. It is also important that work strategy allows robots to complete tasks efficiently, considering that robots interact directly in environments with humans. In this context, this work explores two approaches for robot soccer team coordination for cooperative tasks development. Both approaches are based on a combination of imitation learning and reinforcement learning. Thus, in the first approach was developed a control architecture, a fuzzy inference engine for recognizing situations in robot soccer games, a software for narration of robot soccer games based on the inference engine and the implementation of learning by imitation from observation and analysis of others robotic teams. Moreover, state abstraction was efficiently implemented in reinforcement learning applied to the robot soccer standard problem. Finally, reinforcement learning was implemented in a form where actions are explored only in some states (for example, states where an specialist robot system used them) differently to the traditional form, where actions have to be tested in all states. In the second approach reinforcement learning was implemented with function approximation, for which an algorithm called RBF-Sarsa($lambda$) was created. In both approaches batch reinforcement learning algorithms were implemented and imitation learning was used as a seed for reinforcement learning. Moreover, learning from robotic teams controlled by humans was explored. The proposal in this work had revealed efficient in the robot soccer standard problem and, when implemented in other robotics systems, they will allow that these robotics systems can efficiently and effectively develop assigned tasks. These approaches will give high adaptation capabilities to requirements and environment changes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Reinforcement learning is a machine learning technique that, although finding a large number of applications, maybe is yet to reach its full potential. One of the inadequately tested possibilities is the use of reinforcement learning in combination with other methods for the solution of pattern classification problems. It is well documented in the literature the problems that support vector machine ensembles face in terms of generalization capacity. Algorithms such as Adaboost do not deal appropriately with the imbalances that arise in those situations. Several alternatives have been proposed, with varying degrees of success. This dissertation presents a new approach to building committees of support vector machines. The presented algorithm combines Adaboost algorithm with a layer of reinforcement learning to adjust committee parameters in order to avoid that imbalances on the committee components affect the generalization performance of the final hypothesis. Comparisons were made with ensembles using and not using the reinforcement learning layer, testing benchmark data sets widely known in area of pattern classification

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As raias da família Potamotrygonidae representam um grupo singular da ictiofauna Neotropical. Apesar de serem antigos os relatos sobre o grupo, ainda são muitas as questões que permanecem sem resposta, sobretudo no que diz respeito à biologia das espécies que ocorrem na Bacia do Paraná-Paraguai. No presente trabalho foi analisada a dieta de Potamotrygon falkneri e Potamotrygon motoro, capturadas no Alto Rio Paraná, a jusante da Usina Hidrelétrica Engenheiro Souza Dias (UHE Jupiá). As duas espécies de raias apresentaram dieta diversificada, ingerindo 14 itens, entre moluscos, crustáceos, insetos e peixes, porém com predominância de insetos aquáticos em diversidade e abundância. Somente um indivíduo de cada espécie ingeriu peixe. Potamotrygon motoro consumiu principalmente Ephemeroptera, enquanto P. falkneri, principalmente Mollusca, Hemiptera e Trichoptera. Os dados aparentemente indicam uma dieta mais especializada de P. motoro, com maior consumo de Ephemeroptera (Baetidae), e uma dieta mais generalizada de P. falkneri. A análise dos indivíduos capturados em três micro-hábitats, que diferem quanto ao tipo de substrato e presença de vegetação marginal, sugere diferenças nos tipos de alimentos consumidos.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A linearly tunable low-voltage CMOS transconductor featuring a new adaptative-bias mechanism that considerably improves the stability of the processed-signal common,mode voltage over the tuning range, critical for very-low voltage applications, is introduced. It embeds a feedback loop that holds input devices on triode region while boosting the output resistance. Analysis of the integrator frequency response gives an insight into the location of secondary poles and zeros as function of design parameters. A third-order low-pass Cauer filter employing the proposed transconductor was designed and integrated on a 0.8-mum n-well CMOS standard process. For a 1.8-V supply, filter characterization revealed f(p) = 0.93 MHz, f(s) = 1.82 MHz, A(min) = 44.08, dB, and A(max) = 0.64 dB at nominal tuning. Mined by a de voltage V-TUNE, the filter bandwidth was linearly adjusted at a rate of 11.48 kHz/mV over nearly one frequency decade. A maximum 13-mV deviation on the common-mode voltage at the filter output was measured over the interval 25 mV less than or equal to V-TUNE less than or equal to 200 mV. For V-out = 300 mV(pp) and V-TUNE = 100 mV, THD was -55.4 dB. Noise spectral density was 0.84 muV/Hz(1/2) @1 kHz and S/N = 41 dB @ V-out = 300 mV(pp) and 1-MHz bandwidth. Idle power consumption was 1.73 mW @V-TUNE = 100 mV. A tradeoff between dynamic range, bandwidth, power consumption, and chip area has then been achieved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)