915 results for compression reinforcement
We demonstrate numerically light-pulse combining and pulse compression using wave-collapse (self-focusing) energy-localization dynamics in a continuous-discrete nonlinear system, as implemented in a multicore fiber (MCF) using one-dimensional (1D) and 2D core distribution designs. Large-scale numerical simulations were performed to determine the conditions of the most efficient coherent combining and compression of pulses injected into the considered MCFs. We demonstrate the possibility of combining in a single core 90% of the total energy of pulses initially injected into all cores of a 7-core MCF with a hexagonal lattice. A pulse compression factor of about 720 can be obtained with a 19-core ring MCF.
Traditional knowledge associated with genetic resources (TKaGRs) is acknowledged as a valuable resource. Its value draws from economic, social, cultural, and innovative uses. This value places TK at the heart of competing interests between the indigenous peoples who hold it and depend on it for their survival, and profitable industries that seek to exploit it in the global market space. The latter group seeks, inter alia, to advance and maintain its global competitiveness by exploiting TKaGRs leads in research and development activities connected with modern innovation. Biopiracy remains an issue of central concern to the developing world and has emerged in this context as a label for the inequity arising from the misappropriation of TKaGRs located in the South by commercial interests usually located in the North. Significant attention and resources are being channeled into global efforts to design and implement effective mechanisms to protect TKaGRs against biopiracy. The emergence and recent entry into force of the Nagoya Protocol offers the latest example of a concluded multilateral effort in this regard. The Nagoya Protocol, adopted on the platform of the Convention on Biological Diversity (CBD), establishes an open-ended international access and benefit sharing (ABS) regime comprising the Protocol as well as several complementary instruments. By focusing on the trans-regime nature of biopiracy, this thesis argues that the intellectual property (IP) system forms a central part of the problem of biopiracy, and equally of the efforts to implement solutions, including through the Nagoya Protocol.
The ongoing related work within the World Intellectual Property Organization (WIPO), aimed at developing an international instrument (or a series of instruments) to address the effective protection of TK, constitutes an essential complementary process to the Nagoya Protocol, and, as such, forms a fundamental element within the Nagoya Protocol’s evolving ABS regime-complex. By adopting a third world approach to international law, this thesis draws central significance from its reconceptualization of biopiracy as a trans-regime concept. By construing the instrument(s) being negotiated within WIPO as forming a central component part of the Nagoya Protocol, this dissertation’s analysis highlights the importance of third world efforts to secure an IP-based reinforcement to the Protocol for the effective eradication of biopiracy.
The main objective of this thesis is to develop an industrial refrigeration or air-conditioning system that converts solar energy into cooling. The refrigeration system is based on ejector-compression technology, which offers thermal compression as an economical alternative to costly mechanical compression. The refrigeration subsystem uses a reliable static device called an ejector, driven solely by useful heat from solar energy. It is coupled to a solar loop that includes, among other components, concentrating parabolic trough solar collectors. This combination is intended to achieve high overall energy and exergy efficiencies. Thermal storage is not considered in this thesis but will be integrated into the system in future work. As a first step, a new numerical and thermodynamic model of a single-phase ejector was developed. This design model takes as inputs the inlet conditions of the fluids (pressure, temperature, and velocity) and their flow rates. It assumes that mixing occurs at constant pressure and that the flow is subsonic at the diffuser inlet. It uses a real fluid (R141b), and the outlet pressure is imposed. It also incorporates two important innovations: it uses a constant polytropic efficiency (rather than the constant isentropic efficiencies often used in the literature), and it does not impose a fixed value for the mixing efficiency but determines it from the computed flow conditions. The constant polytropic efficiency is used to quantify the irreversibilities during the acceleration and deceleration processes, as in turbomachinery. The numerical design model was validated against an experimental study from the literature.
The second step aims to propose a numerical model based on experimental data from the literature and compatible with TRNSYS, and another numerical model in EES, intended respectively for the parabolic trough solar collector and the ejector refrigeration subsystem. Finally, once the numerical and thermodynamic models had been developed, a further study proposed a model of the solar ejector refrigeration system that integrates the models of its components. Several parametric studies were carried out to evaluate the effects of certain parameters (refrigerant superheat, heat capacity rate of the heat-transfer fluid, and solar radiation) on its performance. The proposed methodology is based on the laws of classical thermodynamics and on the relations of finite-size thermodynamics. New exergy analyses based on the concept of transit exergy made it possible to evaluate two thermodynamically important indicators: the exergy produced and the exergy consumed, whose ratio expresses the intrinsic exergy efficiency. The results of the studies applied to the ejector and to the overall system show that the traditional calculation of exergy efficiency according to Grassmann is no longer a relevant criterion for evaluating the thermodynamic performance of ejectors for refrigeration systems.
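The two efficiency definitions contrasted in this abstract can be written out as a short sketch. The notation is an assumption, not taken from the abstract: $E_{in}$ and $E_{out}$ are the total exergy entering and leaving the system, and $E_{tr}$ is the transit exergy, meaning the part of the incoming exergy that crosses the system unchanged, following the usual transit-exergy convention.

```latex
\eta_{\text{Grassmann}} = \frac{E_{out}}{E_{in}},
\qquad
\eta_{\text{intrinsic}}
  = \frac{E_{\text{produced}}}{E_{\text{consumed}}}
  = \frac{E_{out} - E_{tr}}{E_{in} - E_{tr}}
```

Under these assumed definitions, a large $E_{tr}$ appears in both the numerator and the denominator of the Grassmann ratio and pushes it toward 1 whatever the device actually transforms, which is one way to read the abstract's conclusion about that criterion.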
That humans and animals learn from interaction with the environment is a foundational idea underlying nearly all theories of learning and intelligence. Learning that certain outcomes are associated with specific actions or stimuli (both internal and external) is at the very core of the capacity to adapt behaviour to environmental changes. In the present work, appetitive and aversive reinforcement learning paradigms have been used to investigate the fronto-striatal loops and behavioural correlates of adaptive and maladaptive reinforcement learning processes, with the aim of a deeper understanding of how cortical and subcortical substrates interact with each other and with other brain systems to support learning. By combining a wide variety of neuroscientific approaches, including behavioural and psychophysiological methods, EEG, and neuroimaging techniques, these studies aim to clarify and advance knowledge of the neural bases and computational mechanisms of reinforcement learning, in both normal and neurologically impaired populations.
Reinforcement learning has recently proved highly effective in machine learning across many fields, such as games, speech recognition, and many others. We therefore decided to apply reinforcement learning to allocation problems, because this research area has not yet been studied with this technique and because the formulation of these problems encompasses a broad set of sub-problems with similar characteristics, so that a solution for one of them extends to each of the others. In this project we built an application called Service Broker, which uses reinforcement learning to learn how to distribute the execution of tasks across asynchronous, distributed workers. The analogy is a cloud data center, which has internal resources (possibly distributed across the server farm), receives tasks from its clients, and runs them on those resources. The goal of the application, and hence of the data center, is to allocate these tasks so as to minimize the execution cost. In addition, to test the reinforcement learning agents we developed, we created an environment (a simulator) that allowed us to focus on developing the components the agents need, without also having to deal with implementation aspects required in a real data center, such as communication with the various nodes and its latency. The results confirmed the theory studied, achieving better performance than some of the classical task allocation methods.
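The idea of learning a cost-minimizing allocation can be illustrated with a toy example. This is a minimal sketch, assuming tabular Q-learning, a state equal to the type of the incoming task, and a fixed cost table; the names `COSTS`, `train_allocator`, and `greedy_policy` and all numeric values are hypothetical, and the abstract does not say which agents or simulator the Service Broker actually uses.

```python
import random

# Hypothetical execution cost of each task type (row) on each worker (column).
COSTS = [
    [1.0, 4.0, 6.0],
    [5.0, 2.0, 7.0],
    [8.0, 6.0, 3.0],
]


def train_allocator(costs, steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn Q[task_type][worker] with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    n_types, n_workers = len(costs), len(costs[0])
    q = [[0.0] * n_workers for _ in range(n_types)]
    state = rng.randrange(n_types)  # type of the first incoming task
    for _ in range(steps):
        # Explore with probability epsilon, otherwise pick the best-known worker.
        if rng.random() < epsilon:
            action = rng.randrange(n_workers)
        else:
            action = max(range(n_workers), key=lambda a: q[state][a])
        reward = -costs[state][action]  # minimizing cost = maximizing reward
        next_state = rng.randrange(n_types)  # next task arrives at random
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return, for each task type, the worker with the highest learned value."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = train_allocator(COSTS)
    print("worker chosen per task type:", greedy_policy(q_table))
```

In this toy setting the next task does not depend on the chosen worker, so the learned policy reduces to picking the cheapest worker per task type; a realistic simulator would add queueing and worker load to the state.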
The increasing use of carbon fiber reinforced polymers (CFRP) is inevitably increasing production scraps and end-of-life components, resulting in a sharp rise in CFRP waste. It is therefore of paramount importance to find efficient ways to reintroduce this waste into the manufacturing cycle. At present, several recycling methods for treating CFRPs are available, although all of them still have to be optimized. The step after CFRP recycling, and also the key to building a solid and sustainable CFRP recycling market, is the utilization of Re-CFs. The smartest way to utilize recovered carbon fibers is to manufacture recycled CFRPs by re-impregnating the recovered fibers with a new polymeric matrix. Fused Filament Fabrication (FFF) is one of the most widely used additive manufacturing (3D printing) techniques. It fabricates parts by depositing a polymeric filament layer by layer, adding material only where it is needed and thereby saving energy, raw material cost, and waste. The filament can also contain fillers or reinforcements such as recycled short carbon fibers, which makes FFF well suited to reusing shortened recycled CFs. Therefore, in this thesis work, PLA filaments reinforced with recycled and virgin carbon fibers were first produced with CF loads of 5% and 10%. The properties and characteristics of the filaments were determined through different analyses (TGA, DMA, DSC). Subsequently, the 5 wt% Re-CF filament was used to 3D print specimens for mechanical characterization (DMA, tensile test, and CTE), in order to evaluate the properties of printed PLA composites containing Re-CFs and the feasibility of using Re-CFs in 3D printing applications.
The first part of my work presents a study of an initial "from scratch" solution developed by Andrew Karpathy. Two improvements of my own follow. The first directly modifies the code of that solution and introduces, as an additional objective for the network in the early stages of play, the interception of the ball by the paddle, which improves initial training. The second is my own implementation using more complex algorithms that are state of the art on Atari games and that train the network much faster.
Reinforcement learning is a machine learning paradigm that has recently proved, time and again, to be a very effective and powerful approach. Cryptography, on the other hand, usually takes the opposite direction: while machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding it. However, the two techniques can be used jointly to create privacy-preserving models that can make inferences on the data without leaking sensitive information. Despite the large number of studies on machine learning and cryptography, reinforcement learning in particular has never been applied to such cases before. Being able to use reinforcement learning successfully in an encrypted scenario would allow us to create an agent that efficiently controls a system without being given full knowledge of the environment it operates in, opening the way to many possible use cases. Therefore, we decided to apply the reinforcement learning paradigm to encrypted data. In this project we applied one of the best-known reinforcement learning algorithms, Deep Q-Learning, to simple simulated environments and studied how encryption affects the training performance of the agent, in order to see whether it is still able to learn how to behave even when the input data is no longer readable by humans. The results of this work show that the agent is still able to learn without difficulty in small state spaces with non-secure encryption, such as AES in ECB mode. For fixed environments, it is also able to reach a suboptimal solution even in the presence of secure modes, such as AES in CBC mode, showing a significant improvement over a random agent; however, its ability to generalize in stochastic environments or large state spaces suffers greatly.
Reinforcement Learning is an increasingly popular area of Artificial Intelligence. The applications of this learning paradigm are many, but its application in mobile computing is in its infancy. This study aims to provide an overview of current Reinforcement Learning applications on mobile devices, as well as to introduce a new framework for iOS devices: Swift-RL Lib. This new Swift package allows developers to easily support and integrate two of the most common RL algorithms, Q-Learning and Deep Q-Network, in a fully customizable environment. All processes are performed on the device, without any need for remote computation. The framework was tested in different settings and evaluated through several use cases. Through an in-depth performance analysis, we show that the platform provides effective and efficient support for Reinforcement Learning for mobile applications.
Reinforcement Learning (RL) provides a powerful framework to address sequential decision-making problems in which the transition dynamics are unknown or too complex to be represented. The RL approach is based on speculating about the best decision to make given sample estimates obtained from previous interactions, a recipe that has led to several breakthroughs in various domains, ranging from game playing to robotics. Despite their success, current RL methods hardly generalize from one task to another, and achieving the kind of generalization obtained through unsupervised pre-training in non-sequential problems seems unthinkable. Unsupervised RL has recently emerged as a way to improve the generalization of RL methods. Just like its non-sequential counterpart, the unsupervised RL framework comprises two phases: an unsupervised pre-training phase, in which the agent interacts with the environment without external feedback, and a supervised fine-tuning phase, in which the agent aims to efficiently solve a task in the same environment by exploiting the knowledge acquired during pre-training. In this thesis, we study unsupervised RL via state entropy maximization, in which the agent makes use of the unsupervised interactions to pre-train a policy that maximizes the entropy of its induced state distribution. First, we provide a theoretical characterization of the learning problem by considering a convex RL formulation that subsumes state entropy maximization. Our analysis shows that maximizing the state entropy in finite trials is inherently harder than RL. Then, we study the state entropy maximization problem from an optimization perspective. In particular, we show that the primal formulation of the corresponding optimization problem can be (approximately) addressed through tractable linear programs.
Finally, we provide the first practical methodologies for state entropy maximization in complex domains, both when pre-training takes place in a single environment and when it takes place in multiple environments.
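The objective named in this abstract, the entropy of the state distribution induced by a policy, can be illustrated with a simple plug-in estimate. This is a minimal sketch, assuming a small discrete state space and empirical visitation counts; the function name `state_entropy` and the toy trajectories are hypothetical, and the abstract does not describe the estimators the thesis uses in complex domains.

```python
import math
from collections import Counter


def state_entropy(visited_states):
    """Shannon entropy (in nats) of the empirical state-visitation distribution."""
    counts = Counter(visited_states)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


if __name__ == "__main__":
    # A policy that keeps returning to one state versus one that spreads out.
    narrow = ["s0", "s0", "s0", "s1"]
    broad = ["s0", "s1", "s2", "s3"]
    print("narrow coverage:", state_entropy(narrow))
    print("broad coverage:", state_entropy(broad))
```

A policy pre-trained to maximize this quantity is pushed toward visiting states evenly, which is the sense in which the objective rewards exploration without any task reward.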
The design of any electric vehicle system has to target the best energy efficiency while maintaining comfort in the vehicle cabin. The main aim of this study is to find the best thermal management solution in terms of HVAC efficiency without compromising occupant comfort and cabin air quality. An Arduino-controlled Low Cost System of Sensors was developed, compared against reference instrumentation (average R-squared of 0.92), and then used to characterise the vehicle cabin in real parking and driving trials. Data on the energy use of the HVAC was retrieved from the car's On-Board Diagnostic port. Energy savings using recirculation can reach 30%, but pollutant concentrations in the cabin build up in this operating mode. Moreover, the temperature profile appeared strongly non-uniform, with air temperature differences of up to 10 °C. Optimisation methods often require a high number of runs to find the optimal configuration of the system. Fast models proved beneficial for this task, while CFD-1D models are usually slower despite the higher level of detail they provide. In this work, the collected dataset was used to train a fast ML model of both cabin and HVAC using linear regression. The average scaled RMSE over all trials is 0.4%, while computation time is 0.0077 ms for each second of simulated time on a laptop computer. Finally, a reinforcement learning environment was built in OpenAI and Stable-Baselines3 using the built-in Proximal Policy Optimisation algorithm to update the policy and seek the best compromise between comfort, air quality and energy reward terms. The learning curves show an oscillating behaviour overall, with only 2 experiments behaving as expected, albeit too slowly. This result leaves ample room for improvement, ranging from reward function engineering to the expansion of the ML model.
Spiking Neural Networks (SNNs) are bio-inspired Artificial Neural Networks (ANNs) utilizing discrete spiking signals, akin to neuron communication in the brain, making them ideal for real-time and energy-efficient Cyber-Physical Systems (CPSs). This thesis explores their potential in Structural Health Monitoring (SHM), leveraging low-cost MEMS accelerometers for early damage detection in motorway bridges. The study focuses on Long Short-Term SNNs (LSNNs), although their complex learning processes pose challenges. Comparing LSNNs with other ANN models and training algorithms for SHM, findings indicate LSNNs' effectiveness in damage identification, comparable to ANNs trained using traditional methods. Additionally, an optimized embedded LSNN implementation demonstrates a 54% reduction in execution time, but with longer pre-processing due to spike-based encoding. Furthermore, SNNs are applied in UAV obstacle avoidance, trained directly using a Reinforcement Learning (RL) algorithm with event-based input from a Dynamic Vision Sensor (DVS). Performance evaluation against Convolutional Neural Networks (CNNs) highlights SNNs' superior energy efficiency, showing a 6x decrease in energy consumption. The study also investigates embedded SNN implementations' latency and throughput in real-world deployments, emphasizing their potential for energy-efficient monitoring systems. This research contributes to advancing SHM and UAV obstacle avoidance through SNNs' efficient information processing and decision-making capabilities within CPS domains.
In the economics and game theory literature there is an open debate on whether anticompetitive behaviour can emerge from algorithms that set market prices automatically. The aim of this thesis is to develop an actor-critic reinforcement learning model with entropy regularization to set prices in a dynamic game of oligopolistic competition with continuous prices. The model I propose consistently exhibits cooperative behaviour supported by punishment mechanisms that discourage deviation from the equilibrium reached at convergence. The behaviour of this model during learning and after convergence also helps to interpret the actions taken by tabular Q-learning and other pricing algorithms under similar conditions. The results are robust to changes in the number of competing agents and in the type of deviation from the equilibrium obtained at convergence, with deviations to higher prices also being punished.
Natural Language Processing (NLP) has seen tremendous improvements over the last few years. Transformer architectures achieved impressive results in almost any NLP task, such as Text Classification, Machine Translation, and Language Generation. As time went by, transformers continued to improve thanks to larger corpora and bigger networks, reaching hundreds of billions of parameters. Training and deploying such large models has become prohibitively expensive, such that only big high tech companies can afford to train those models. Therefore, a lot of research has been dedicated to reducing a model’s size. In this thesis, we investigate the effects of Vocabulary Transfer and Knowledge Distillation for compressing large Language Models. The goal is to combine these two methodologies to further compress models without significant loss of performance. In particular, we designed different combination strategies and conducted a series of experiments on different vertical domains (medical, legal, news) and downstream tasks (Text Classification and Named Entity Recognition). Four different methods involving Vocabulary Transfer (VIPI) with and without a Masked Language Modelling (MLM) step and with and without Knowledge Distillation are compared against a baseline that assigns random vectors to new elements of the vocabulary. Results indicate that VIPI effectively transfers information of the original vocabulary and that MLM is beneficial. It is also noted that both vocabulary transfer and knowledge distillation are orthogonal to one another and may be applied jointly. The application of knowledge distillation first before subsequently applying vocabulary transfer is recommended. Finally, model performance due to vocabulary transfer does not always show a consistent trend as the vocabulary size is reduced. Hence, the choice of vocabulary size should be empirically selected by evaluation on the downstream task similar to hyperparameter tuning.
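The knowledge distillation step mentioned in this abstract is commonly implemented as a temperature-softened loss between teacher and student outputs. This is a minimal sketch, assuming the usual soft-target formulation (KL divergence between softened distributions, scaled by the squared temperature); the function names and the example logits are hypothetical, and the abstract does not state which loss the thesis actually uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T squared."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print("matching student:", distillation_loss([1.0, 2.0, 3.0], teacher))
    print("mismatched student:", distillation_loss([3.0, 2.0, 1.0], teacher))
```

In practice this term is usually combined with the ordinary task loss on the labelled data, with a weighting that is tuned like any other hyperparameter.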
The phenomenon known as the Internet of Things is today the main driver of the expansion of the global Internet, being responsible for connecting billions of new devices. Because of the limited energy and processing capabilities of these devices, many standard Internet protocols need to be redesigned. A prime example is the definition of the Constrained Application Protocol (CoAP), a client-server communication protocol designed to replace HTTP in IoT networks. To enable compatibility between IoT networks and the Internet, guidelines have been defined for mapping CoAP messages to HTTP messages and vice versa, thus allowing the implementation of proxies able to connect an IoT network to the Internet. However, this mapping is limited to the fields and messages needed to implement a REST architecture, which makes it impossible to use application-layer protocols based on HTTP. The proposed solution consists in defining an adaptive compression protocol for HTTP messages, so that solutions that are valid outside IoT scenarios, such as the exchange of generic messages, can also be implemented in IoT networks. The results further show that, in the reference scenario, adaptive compression of HTTP messages performs worse than other header compression algorithms (HPACK in particular), but is still more than adequate because it is currently the only one applicable in IoT scenarios.
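Why HTTP messages compress well when both endpoints share knowledge of common header text can be shown with a small example. This is a minimal sketch, assuming DEFLATE with a static preset dictionary from Python's `zlib` module, which is a simpler and different technique from the adaptive protocol proposed in the abstract; the dictionary contents, the sample request, and the helper names are hypothetical.

```python
import zlib

# Hypothetical dictionary of byte strings that both endpoints agree on in advance.
SHARED_DICT = (
    b"GET POST PUT DELETE HTTP/1.1\r\nHost: \r\nAccept: application/json\r\n"
    b"Content-Type: application/json\r\nContent-Length: \r\n\r\n"
)


def compress(message, zdict=None):
    """Compress a message, optionally priming DEFLATE with a shared dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(message) + comp.flush()


def decompress(blob, zdict=None):
    """Invert compress(); the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    request = (
        b"GET /sensors/temp HTTP/1.1\r\n"
        b"Host: node1.example\r\n"
        b"Accept: application/json\r\n\r\n"
    )
    print("original bytes:", len(request))
    print("plain DEFLATE:", len(compress(request)))
    print("with shared dictionary:", len(compress(request, SHARED_DICT)))
```

An adaptive scheme would go further by updating the shared context as messages are exchanged, at the cost of keeping state synchronized between the constrained node and the proxy.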