892 resultados para reinforcement


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: This paper is a commentary to a debate article entitled: "Are we overpathologizing everyday life? A tenable blueprint for behavioral addiction research", by Billieux et al. (2015). Methods and aim: This brief response focused on the necessity to better characterize psychological and related neurocognitive determinants of persistent deleterious actions associated or not with substance utilization. Results: A majority of addicted people could be driven by psychological functional reasons to keep using drugs, gambling or buying despite the growing number of related negative consequences. In addition, a non-negligible proportion of them would need assistance to restore profound disturbances in basic learning processes involved in compulsive actions. Conclusions: The distinction between psychological functionality and compulsive aspects of addictive behaviors should represent a big step towards more efficient treatments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cigarette smoking remains the leading preventable cause of death and disability in the United States and most often is initiated during adolescence. An emerging body of research suggests that a negative reinforcement model may explain factors that contribute to tobacco use during adolescence and that negative reinforcement processes may contribute to tobacco use to a greater extent among female adolescents than among male adolescents. However, the extant literature both on the relationship between negative reinforcement processes and adolescent tobacco use as well as on the relationship between gender, negative reinforcement processes, and adolescent tobacco use is limited by the sole reliance on self-report measures of negative reinforcement processes that may contribute to cigarette smoking. The current study aimed to further disentangle the relationships between negative reinforcement based risk taking, gender and tobacco use during older adolescence by utilizing a behavioral analogue measure of negative reinforcement based risk taking, the Maryland Resource for the Behavioral Utilization of the Reinforcement of Negative Stimuli (MRBURNS). Specifically, we examined the relationship between pumps on the MRBURNS, an indicator of risk taking, and smoking status as well as the interaction between MRBURNS pumps and gender for predicting smoking status. Participants included 103 older adolescents (n=51 smokers, 50.5% female, Age (M(SD) = 19.41(1.06)) who all attended one experimental session during which they completed the MRBURNS as well as self-report measures of tobacco use, nicotine dependence, alcohol use, depression, and anxiety. We utilized binary logistic regressions to examine the relationship between MRBURNS pumps and smoking status as well as the interactive effect of MRBURNS pumps and gender for predicting smoking status. Controlling for relevant covariates, pumps on the MRBURNS did not significantly predict smoking status and the interaction between pumps on the MRBURNS and gender also did not significantly predict smoking status. These findings highlight the importance of future research examining various task modifications to the MRBURNS as well as the need for replications of this study with larger, more diverse samples.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traditional knowledge associated with genetic resources (TKaGRs) is acknowledged as a valuable resource. Its value draws from economic, social, cultural, and innovative uses. This value places TK at the heart of competing interests as between indigenous peoples who hold it and depend on it for their survival, and profitable industries which seek to exploit it in the global market space. The latter group seek, inter alia, to advance and maintain their global competitiveness by exploiting TKaGRs leads in their research and development activities connected with modern innovation. Biopiracy remains an issue of central concern to the developing world and has emerged in this context as a label for the inequity arising from the misappropriation of TKaGRs located in the South by commercial interests usually located in the North. Significant attention and resources are being channeled at global efforts to design and implement effective protection mechanisms for TKaGRs against the incidence of biopiracy. The emergence and recent entry into force of the Nagoya Protocol offers the latest example of a concluded multilateral effort in this regard. The Nagoya Protocol, adopted on the platform of the Convention on Biological Diversity (CBD), establishes an open-ended international access and benefit sharing (ABS) regime which is comprised of the Protocol as well as several complementary instruments. By focusing on the trans-regime nature of biopiracy, this thesis argues that the intellectual property (IP) system forms a central part of the problem of biopiracy, and so too to the very efforts to implement solutions, including through the Nagoya Protocol. The ongoing related work within the World Intellectual Property Organization (WIPO), aimed at developing an international instrument (or a series of instruments) to address the effective protection of TK, constitutes an essential complementary process to the Nagoya Protocol, and, as such, forms a fundamental element within the Nagoya Protocol’s evolving ABS regime-complex. By adopting a third world approach to international law, this thesis draws central significance from its reconceptualization of biopiracy as a trans-regime concept. By construing the instrument(s) being negotiated within WIPO as forming a central component part of the Nagoya Protocol, this dissertation’s analysis highlights the importance of third world efforts to secure an IP-based reinforcement to the Protocol for the effective eradication of biopiracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

That humans and animals learn from interaction with the environment is a foundational idea underlying nearly all theories of learning and intelligence. Learning that certain outcomes are associated with specific actions or stimuli (both internal and external), is at the very core of the capacity to adapt behaviour to environmental changes. In the present work, appetitive and aversive reinforcement learning paradigms have been used to investigate the fronto-striatal loops and behavioural correlates of adaptive and maladaptive reinforcement learning processes, aiming to a deeper understanding of how cortical and subcortical substrates interacts between them and with other brain systems to support learning. By combining a large variety of neuroscientific approaches, including behavioral and psychophysiological methods, EEG and neuroimaging techniques, these studies aim at clarifying and advancing the knowledge of the neural bases and computational mechanisms of reinforcement learning, both in normal and neurologically impaired population.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Al giorno d'oggi il reinforcement learning ha dimostrato di essere davvero molto efficace nel machine learning in svariati campi, come ad esempio i giochi, il riconoscimento vocale e molti altri. Perciò, abbiamo deciso di applicare il reinforcement learning ai problemi di allocazione, in quanto sono un campo di ricerca non ancora studiato con questa tecnica e perchè questi problemi racchiudono nella loro formulazione un vasto insieme di sotto-problemi con simili caratteristiche, per cui una soluzione per uno di essi si estende ad ognuno di questi sotto-problemi. In questo progetto abbiamo realizzato un applicativo chiamato Service Broker, il quale, attraverso il reinforcement learning, apprende come distribuire l'esecuzione di tasks su dei lavoratori asincroni e distribuiti. L'analogia è quella di un cloud data center, il quale possiede delle risorse interne - possibilmente distribuite nella server farm -, riceve dei tasks dai suoi clienti e li esegue su queste risorse. L'obiettivo dell'applicativo, e quindi del data center, è quello di allocare questi tasks in maniera da minimizzare il costo di esecuzione. Inoltre, al fine di testare gli agenti del reinforcement learning sviluppati è stato creato un environment, un simulatore, che permettesse di concentrarsi nello sviluppo dei componenti necessari agli agenti, invece che doversi anche occupare di eventuali aspetti implementativi necessari in un vero data center, come ad esempio la comunicazione con i vari nodi e i tempi di latenza di quest'ultima. I risultati ottenuti hanno dunque confermato la teoria studiata, riuscendo a ottenere prestazioni migliori di alcuni dei metodi classici per il task allocation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The increased exploitation of carbon fiber reinforced polymers (CFRP) is inevitably bringing about an increase in production scraps and end-of-life components, resulting in a sharp increase in CFRP waste. Therefore, it is of paramount importance to find efficient ways to reintroduce waste into the manufacturing cycle. At present, several recycling methods for treating CFRPs are available, even if all of them still have to be optimized. The step after CFRP recycling, and also the key to build a solid and sustainable CFRP recycling market, is represented by the utilization of Re-CFs. The smartest way to utilize recovered carbon fibers is through the manufacturing of recycled CFRPs, that can be done by re-impregnating the recovered fibers with a new polymeric matrix. Fused Filament Fabrication (FFF) is one of the most widely used additive manufacturing (3D printing) techniques that fabricates parts with a polymeric filament deposition process that allows to produce parts adding material layer-by-layer, only where it is needed, saving energy, raw material cost, and waste. The filament can also contain fillers or reinforcements such as recycled short carbon fibers and this makes it perfectly compliant with the re-application of the shortened recycled CF. Therefore, in this thesis work recycled and virgin carbon fiber reinforced PLA filaments have been initially produced using 5% and 10% of CFs load. Properties and characteristics of the filaments have been determined conducting different analysis (TGA, DMA, DSC). Subsequently the 5%wt. Re-CFs filament has been used to 3D print specimens for mechanical characterization (DMA, tensile test and CTE), in order to evaluate properties of printed PLA composites containing Re-CFs and evaluate the feasibility of Re-CFs in 3D printing application.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nella prima parte del mio lavoro viene presentato uno studio di una prima soluzione "from scratch" sviluppata da Andrew Karpathy. Seguono due miei miglioramenti: il primo modificando direttamente il codice della precedente soluzione e introducendo, come obbiettivo aggiuntivo per la rete nelle prime fasi di gioco, l'intercettazione della pallina da parte della racchetta, migliorando l'addestramento iniziale; il secondo é una mia personale implementazione utilizzando algoritmi più complessi, che sono allo stato dell'arte su giochi dell'Atari, e che portano un addestramento molto più veloce della rete.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reinforcement learning is a particular paradigm of machine learning that, recently, has proved times and times again to be a very effective and powerful approach. On the other hand, cryptography usually takes the opposite direction. While machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding such data. However, the two techniques can be jointly used to create privacy preserving models, able to make inferences on the data without leaking sensitive information. Despite the numerous amount of studies performed on machine learning and cryptography, reinforcement learning in particular has never been applied to such cases before. Being able to successfully make use of reinforcement learning in an encrypted scenario would allow us to create an agent that efficiently controls a system without providing it with full knowledge of the environment it is operating in, leading the way to many possible use cases. Therefore, we have decided to apply the reinforcement learning paradigm to encrypted data. In this project we have applied one of the most well-known reinforcement learning algorithms, called Deep Q-Learning, to simple simulated environments and studied how the encryption affects the training performance of the agent, in order to see if it is still able to learn how to behave even when the input data is no longer readable by humans. The results of this work highlight that the agent is still able to learn with no issues whatsoever in small state spaces with non-secure encryptions, like AES in ECB mode. For fixed environments, it is also able to reach a suboptimal solution even in the presence of secure modes, like AES in CBC mode, showing a significant improvement with respect to a random agent; however, its ability to generalize in stochastic environments or big state spaces suffers greatly.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reinforcement Learning is an increasingly popular area of Artificial Intelligence. The applications of this learning paradigm are many, but its application in mobile computing is in its infancy. This study aims to provide an overview of current Reinforcement Learning applications on mobile devices, as well as to introduce a new framework for iOS devices: Swift-RL Lib. This new Swift package allows developers to easily support and integrate two of the most common RL algorithms, Q-Learning and Deep Q-Network, in a fully customizable environment. All processes are performed on the device, without any need for remote computation. The framework was tested in different settings and evaluated through several use cases. Through an in-depth performance analysis, we show that the platform provides effective and efficient support for Reinforcement Learning for mobile applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reinforcement Learning (RL) provides a powerful framework to address sequential decision-making problems in which the transition dynamics is unknown or too complex to be represented. The RL approach is based on speculating what is the best decision to make given sample estimates obtained from previous interactions, a recipe that led to several breakthroughs in various domains, ranging from game playing to robotics. Despite their success, current RL methods hardly generalize from one task to another, and achieving the kind of generalization obtained through unsupervised pre-training in non-sequential problems seems unthinkable. Unsupervised RL has recently emerged as a way to improve generalization of RL methods. Just as its non-sequential counterpart, the unsupervised RL framework comprises two phases: An unsupervised pre-training phase, in which the agent interacts with the environment without external feedback, and a supervised fine-tuning phase, in which the agent aims to efficiently solve a task in the same environment by exploiting the knowledge acquired during pre-training. In this thesis, we study unsupervised RL via state entropy maximization, in which the agent makes use of the unsupervised interactions to pre-train a policy that maximizes the entropy of its induced state distribution. First, we provide a theoretical characterization of the learning problem by considering a convex RL formulation that subsumes state entropy maximization. Our analysis shows that maximizing the state entropy in finite trials is inherently harder than RL. Then, we study the state entropy maximization problem from an optimization perspective. Especially, we show that the primal formulation of the corresponding optimization problem can be (approximately) addressed through tractable linear programs. Finally, we provide the first practical methodologies for state entropy maximization in complex domains, both when the pre-training takes place in a single environment as well as multiple environments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The design process of any electric vehicle system has to be oriented towards the best energy efficiency, together with the constraint of maintaining comfort in the vehicle cabin. Main aim of this study is to research the best thermal management solution in terms of HVAC efficiency without compromising occupant’s comfort and internal air quality. An Arduino controlled Low Cost System of Sensors was developed and compared against reference instrumentation (average R-squared of 0.92) and then used to characterise the vehicle cabin in real parking and driving conditions trials. Data on the energy use of the HVAC was retrieved from the car On-Board Diagnostic port. Energy savings using recirculation can reach 30 %, but pollutants concentration in the cabin builds up in this operating mode. Moreover, the temperature profile appeared strongly nonuniform with air temperature differences up to 10° C. Optimisation methods often require a high number of runs to find the optimal configuration of the system. Fast models proved to be beneficial for these task, while CFD-1D model are usually slower despite the higher level of detail provided. In this work, the collected dataset was used to train a fast ML model of both cabin and HVAC using linear regression. Average scaled RMSE over all trials is 0.4 %, while computation time is 0.0077 ms for each second of simulated time on a laptop computer. Finally, a reinforcement learning environment was built in OpenAI and Stable-Baselines3 using the built-in Proximal Policy Optimisation algorithm to update the policy and seek for the best compromise between comfort, air quality and energy reward terms. The learning curves show an oscillating behaviour overall, with only 2 experiments behaving as expected even if too slow. This result leaves large room for improvement, ranging from the reward function engineering to the expansion of the ML model.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Spiking Neural Networks (SNNs) are bio-inspired Artificial Neural Networks (ANNs) utilizing discrete spiking signals, akin to neuron communication in the brain, making them ideal for real-time and energy-efficient Cyber-Physical Systems (CPSs). This thesis explores their potential in Structural Health Monitoring (SHM), leveraging low-cost MEMS accelerometers for early damage detection in motorway bridges. The study focuses on Long Short-Term SNNs (LSNNs), although their complex learning processes pose challenges. Comparing LSNNs with other ANN models and training algorithms for SHM, findings indicate LSNNs' effectiveness in damage identification, comparable to ANNs trained using traditional methods. Additionally, an optimized embedded LSNN implementation demonstrates a 54% reduction in execution time, but with longer pre-processing due to spike-based encoding. Furthermore, SNNs are applied in UAV obstacle avoidance, trained directly using a Reinforcement Learning (RL) algorithm with event-based input from a Dynamic Vision Sensor (DVS). Performance evaluation against Convolutional Neural Networks (CNNs) highlights SNNs' superior energy efficiency, showing a 6x decrease in energy consumption. The study also investigates embedded SNN implementations' latency and throughput in real-world deployments, emphasizing their potential for energy-efficient monitoring systems. This research contributes to advancing SHM and UAV obstacle avoidance through SNNs' efficient information processing and decision-making capabilities within CPS domains.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nella letteratura economica e di teoria dei giochi vi è un dibattito aperto sulla possibilità di emergenza di comportamenti anticompetitivi da parte di algoritmi di determinazione automatica dei prezzi di mercato. L'obiettivo di questa tesi è sviluppare un modello di reinforcement learning di tipo actor-critic con entropy regularization per impostare i prezzi in un gioco dinamico di competizione oligopolistica con prezzi continui. Il modello che propongo esibisce in modo coerente comportamenti cooperativi supportati da meccanismi di punizione che scoraggiano la deviazione dall'equilibrio raggiunto a convergenza. Il comportamento di questo modello durante l'apprendimento e a convergenza avvenuta aiuta inoltre a interpretare le azioni compiute da Q-learning tabellare e altri algoritmi di prezzo in condizioni simili. I risultati sono robusti alla variazione del numero di agenti in competizione e al tipo di deviazione dall'equilibrio ottenuto a convergenza, punendo anche deviazioni a prezzi più alti.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis is focused on the design of a flexible, dynamic and innovative telecommunication's system for future 6G applications on vehicular communications. The system is based on the development of drones acting as mobile base stations in an urban scenario to cope with the increasing traffic demand and avoid network's congestion conditions. In particular, the exploitation of Reinforcement Learning algorithms is used to let the drone learn autonomously how to behave in a scenario full of obstacles with the goal of tracking and serve the maximum number of moving vehicles, by at the same time, minimizing the energy consumed to perform its tasks. This project is an extraordinary opportunity to open the doors to a new way of applying and develop telecommunications in an urban scenario by mixing it to the rising world of the Artificial Intelligence.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research intended to investigate the use of diazepam in conjunction with behavioral strategies to manage uncooperative behavior of child dental patients. The 6 participants received dental treatment during 9 sessions. Using a double-blind design, children received placebo or diazepam and at the same time were submitted to behavior management produces (distraction, explanation, reinforcement and set rule and limits). All sessions were recorded in video-tapes biped in 15 seconds intervals, in which observers recorded child's (crying, body and/or head movements, escape and avoidance) and dentist's behavior. The results indicated that diazepam, considering the used dose, was only effective with one subject. The other participants didn't permit the treatment and showed an increase in their resistance. The behavioral preparation strategies for dental treatment should have been more precisely planned in order to help the child to face the real dental treatment conditions mainly in the first sessions avoiding to reinforce inappropriate behaviors.