953 results for tunnel reinforcement
Abstract:
Reinforcement Learning is an increasingly popular area of Artificial Intelligence. The applications of this learning paradigm are many, but its application in mobile computing is in its infancy. This study aims to provide an overview of current Reinforcement Learning applications on mobile devices, as well as to introduce a new framework for iOS devices: Swift-RL Lib. This new Swift package allows developers to easily support and integrate two of the most common RL algorithms, Q-Learning and Deep Q-Network, in a fully customizable environment. All processes are performed on the device, without any need for remote computation. The framework was tested in different settings and evaluated through several use cases. Through an in-depth performance analysis, we show that the platform provides effective and efficient support for Reinforcement Learning for mobile applications.
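As a rough illustration of the tabular Q-Learning rule mentioned above, the following minimal sketch is written in Python rather than Swift and is not taken from Swift-RL Lib. The function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are all assumptions made for this example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-Learning update and return the new Q-value.

    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen state-action pairs start at 0.0.
q_table = defaultdict(float)
actions = ["left", "right"]
new_value = q_update(q_table, state=0, action="right", reward=1.0,
                     next_state=1, actions=actions)
print(new_value)
```

A Deep Q-Network replaces the table with a neural network that approximates the same values, which is why the two algorithms are often offered side by side.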
Abstract:
Reinforcement Learning (RL) provides a powerful framework to address sequential decision-making problems in which the transition dynamics are unknown or too complex to be represented. The RL approach is based on speculating on the best decision to make given sample estimates obtained from previous interactions, a recipe that has led to several breakthroughs in various domains, ranging from game playing to robotics. Despite their success, current RL methods hardly generalize from one task to another, and achieving the kind of generalization obtained through unsupervised pre-training in non-sequential problems seems unthinkable. Unsupervised RL has recently emerged as a way to improve the generalization of RL methods. Just like its non-sequential counterpart, the unsupervised RL framework comprises two phases: an unsupervised pre-training phase, in which the agent interacts with the environment without external feedback, and a supervised fine-tuning phase, in which the agent aims to efficiently solve a task in the same environment by exploiting the knowledge acquired during pre-training. In this thesis, we study unsupervised RL via state entropy maximization, in which the agent makes use of the unsupervised interactions to pre-train a policy that maximizes the entropy of its induced state distribution. First, we provide a theoretical characterization of the learning problem by considering a convex RL formulation that subsumes state entropy maximization. Our analysis shows that maximizing the state entropy in finite trials is inherently harder than RL. Then, we study the state entropy maximization problem from an optimization perspective. In particular, we show that the primal formulation of the corresponding optimization problem can be (approximately) addressed through tractable linear programs. Finally, we provide the first practical methodologies for state entropy maximization in complex domains, both when pre-training takes place in a single environment and when it spans multiple environments.
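To make the pre-training objective concrete, the sketch below estimates the entropy of the state distribution induced by a batch of visited states, using the plug-in estimate H = -sum over s of d(s) log d(s), where d(s) is the empirical visitation frequency. It assumes a small discrete state space. The function name and the state labels are hypothetical, and the thesis's own estimators for complex domains are not reproduced here.

```python
import math
from collections import Counter


def state_entropy(visited_states):
    """Plug-in entropy (in nats) of the empirical state visitation distribution."""
    counts = Counter(visited_states)
    total = len(visited_states)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical rollouts: one policy spreads its visits, the other stays near s0.
uniform_entropy = state_entropy(["s0", "s1", "s2", "s3"])
peaked_entropy = state_entropy(["s0", "s0", "s0", "s1"])
print(uniform_entropy, peaked_entropy)
```

A policy that covers the state space evenly scores higher under this measure, which is the behavior the unsupervised phase rewards.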
Abstract:
The design process of any electric vehicle system has to be oriented towards the best energy efficiency, subject to the constraint of maintaining comfort in the vehicle cabin. The main aim of this study is to identify the best thermal management solution in terms of HVAC efficiency without compromising occupants' comfort or cabin air quality. An Arduino-controlled low-cost sensor system was developed, compared against reference instrumentation (average R-squared of 0.92), and then used to characterise the vehicle cabin in real parking and driving trials. Data on the energy use of the HVAC was retrieved from the car's On-Board Diagnostic port. Energy savings using recirculation can reach 30 %, but pollutant concentrations in the cabin build up in this operating mode. Moreover, the temperature profile appeared strongly nonuniform, with air temperature differences of up to 10 °C. Optimisation methods often require a high number of runs to find the optimal configuration of the system. Fast models proved to be beneficial for these tasks, while CFD-1D models are usually slower despite the higher level of detail they provide. In this work, the collected dataset was used to train a fast ML model of both cabin and HVAC using linear regression. The average scaled RMSE over all trials is 0.4 %, while the computation time is 0.0077 ms for each second of simulated time on a laptop computer. Finally, a reinforcement learning environment was built in OpenAI and Stable-Baselines3, using the built-in Proximal Policy Optimisation algorithm to update the policy and seek the best compromise between comfort, air quality and energy reward terms. The learning curves show an oscillating behaviour overall, with only 2 experiments behaving as expected, albeit too slowly. This result leaves considerable room for improvement, ranging from reward function engineering to the expansion of the ML model.
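The fast-model idea can be sketched with an ordinary least-squares fit and a scaled error metric. Everything in this example is an assumption: the data are made-up numbers, a single input feature stands in for the full cabin and HVAC dataset, and "scaled RMSE" is taken to mean RMSE divided by the range of the target, a definition the abstract does not state.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def scaled_rmse(ys, preds):
    """RMSE divided by the target range (an assumed scaling choice)."""
    rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return rmse / (max(ys) - min(ys))


# Made-up samples: HVAC setting (input) against cabin temperature change (output).
hvac_setting = [0.0, 1.0, 2.0, 3.0, 4.0]
temp_change = [1.0, 3.1, 4.9, 7.0, 9.0]

slope, intercept = fit_line(hvac_setting, temp_change)
predictions = [slope * x + intercept for x in hvac_setting]
error = scaled_rmse(temp_change, predictions)
print(slope, intercept, error)
```

A model of this kind evaluates in a handful of arithmetic operations per step, which is what makes it attractive inside a reinforcement learning loop that needs many simulated episodes.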
Abstract:
Spiking Neural Networks (SNNs) are bio-inspired Artificial Neural Networks (ANNs) utilizing discrete spiking signals, akin to neuron communication in the brain, making them ideal for real-time and energy-efficient Cyber-Physical Systems (CPSs). This thesis explores their potential in Structural Health Monitoring (SHM), leveraging low-cost MEMS accelerometers for early damage detection in motorway bridges. The study focuses on Long Short-Term SNNs (LSNNs), although their complex learning processes pose challenges. Comparing LSNNs with other ANN models and training algorithms for SHM, findings indicate LSNNs' effectiveness in damage identification, comparable to ANNs trained using traditional methods. Additionally, an optimized embedded LSNN implementation demonstrates a 54% reduction in execution time, but with longer pre-processing due to spike-based encoding. Furthermore, SNNs are applied in UAV obstacle avoidance, trained directly using a Reinforcement Learning (RL) algorithm with event-based input from a Dynamic Vision Sensor (DVS). Performance evaluation against Convolutional Neural Networks (CNNs) highlights SNNs' superior energy efficiency, showing a 6x decrease in energy consumption. The study also investigates embedded SNN implementations' latency and throughput in real-world deployments, emphasizing their potential for energy-efficient monitoring systems. This research contributes to advancing SHM and UAV obstacle avoidance through SNNs' efficient information processing and decision-making capabilities within CPS domains.
Abstract:
This work presents the case of the San Lorenzo road tunnel, a transportation infrastructure located in northern Italy and affected by the so-called Passo della Morte landslide. The tunnel crosses a large rockslide characterized by slow movements. Damage such as water seepage inside the tunnel and detachment of the concrete lining has surfaced over the years, increasing the risk. The objective of this work is to trace back the landslide-induced stresses directly responsible for the crack pattern on the most damaged segments of the tunnel. The first section describes the overall context: the geography of the site and its strategic relevance, the geological setting, and the hydrological and climatic conditions. The road tunnel and its interaction with the landslide are discussed together with the active monitoring system, which has been operating for more than 20 years. The second part reports the steps and tools used to characterize the road damage in more detail, which led to a visualization of the current state of the most damaged portions of the road. Attention then turns to the stresses acting on those portions of the tunnel, through a FEM model of a tunnel section built in a selected software package. This latter step can be regarded as a starting point for further developments. Some preliminary results are shown to demonstrate the soundness of the assumptions made. Future work aims to continually expand the information supplied to the FEM software and to validate the results obtained against interpretive tools for the monitoring data.
Abstract:
In the economics and game theory literature there is an open debate about whether automated market pricing algorithms can develop anticompetitive behavior. The goal of this thesis is to develop an actor-critic reinforcement learning model with entropy regularization to set prices in a dynamic game of oligopolistic competition with continuous prices. The model I propose consistently exhibits cooperative behavior supported by punishment mechanisms that discourage deviation from the equilibrium reached at convergence. The behavior of this model during learning and after convergence also helps to interpret the actions taken by tabular Q-learning and other pricing algorithms under similar conditions. The results are robust to changes in the number of competing agents and to the type of deviation from the equilibrium obtained at convergence, with deviations to higher prices also being punished.
Abstract:
Linear cascade testing plays a fundamental role in the research, development, and design of turbomachines, as it is a simple yet very effective way to assess the performance of a generic blade geometry. These experiments are usually carried out in specialized wind tunnel facilities. This thesis deals with the numerical characterization and subsequent partial redesign of the S-1/C Continuous High Speed Wind Tunnel of the Von Karman Institute for Fluid Dynamics. The current facility is powered by a 13-stage axial compressor that is not powerful enough to balance the energy loss experienced when testing low-turning airfoils. To address this issue, a performance assessment of the wind tunnel was carried out under several flow regimes via numerical simulations. A redesign proposal aimed at reducing the pressure loss was then investigated. It consists of a linear cascade of turning blades placed downstream of the test section and designed specifically for the type of linear cascade being tested. An automatic design procedure was created that takes as input the parameters measured at the outlet of the cascade. The parametrization method employed Bézier curves to produce an airfoil geometry that could be imported into CAD software so that a cascade could be designed. The proposal was simulated via CFD analysis and proved effective in reducing pressure losses by up to 41%. The tool developed in this thesis could be adopted to design similar apparatuses and could also be optimized and specialized for the design of turbomachine components.
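The Bézier parametrization mentioned above can be illustrated with De Casteljau's algorithm, which evaluates a curve by repeated linear interpolation between control points. This is a minimal sketch: the three control points stand for a hypothetical camber line and are not taken from the thesis, and the function names are chosen for this example only.

```python
def bezier_point(control_points, t):
    """Evaluate a Bézier curve at parameter t in [0, 1] with De Casteljau's algorithm."""
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one point remains.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def sample_curve(control_points, n=50):
    """Return n evenly spaced (in t) points along the curve, endpoints included."""
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]


# Hypothetical quadratic camber line: leading edge, mid control point, trailing edge.
camber_controls = [(0.0, 0.0), (0.5, 0.3), (1.0, 0.0)]
camber_line = sample_curve(camber_controls, n=50)
mid_point = bezier_point(camber_controls, 0.5)
print(mid_point)
```

Because the curve is fully determined by a few control points, an automatic design loop only has to adjust those points, and the sampled coordinates can then be exported to a CAD tool.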
Abstract:
This thesis focuses on the design of a flexible, dynamic and innovative telecommunication system for future 6G applications in vehicular communications. The system is based on the deployment of drones acting as mobile base stations in an urban scenario, to cope with increasing traffic demand and avoid network congestion. In particular, Reinforcement Learning algorithms are used to let the drone learn autonomously how to behave in a scenario full of obstacles, with the goal of tracking and serving the maximum number of moving vehicles while minimizing the energy consumed to perform its tasks. This project is an opportunity to open the door to a new way of applying and developing telecommunications in an urban scenario by combining them with the rising field of Artificial Intelligence.
Abstract:
The scope of this study is to design an automatic control system and an automatic x-wire calibrator for a facility named Plane Air Tunnel, whose exit creates a planar jet flow. Control of the inverter's power state and automatic adjustment of its speed have been achieved. Thus, the wind tunnel can be run at any desired speed and the x-wire can be calibrated automatically at that speed. To achieve this, VI programming in the LabVIEW environment was used to acquire the pressure and temperature and to calculate the velocity from the data acquired with a pitot-static tube. Communication with the inverter, to issue the power on/off and speed control commands, was also implemented in the LabVIEW VI coding environment. The computer was connected to the inverter with the appropriate cabling, using DAQmx Analog/Digital (A/D) input/output (I/O). Moreover, the pressure profile along the streamwise direction of the plane air tunnel was studied. Pressure tappings and a multichannel pressure scanner were used to acquire the pressure values at different locations. From these measurements, the aerodynamic efficiency of the contraction ratio was observed, and the pressure behavior was related to the velocity at the exit section. Using this pressure information, speed control was accomplished by implementing a closed-loop PI controller in the LabVIEW environment, both with and without a pitot-static tube. The responses of the two controllers were analyzed and discussed, and suggestions were given. In addition, hot-wire experiments were performed to calibrate the probe automatically and to investigate the velocity profile of a turbulent planar jet. To analyze the results, the physics of turbulent planar jet flow was studied. The fundamental terms, the methods used in the derivation of the equations, the velocity profile, the shear stress behavior, and the effect of vorticity were reviewed.
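The two calculations described above, velocity from a pitot-static tube and closed-loop PI speed control, can be sketched as follows. The sketch is in Python rather than LabVIEW and makes several assumptions: air is treated as an ideal gas, the tunnel is replaced by a hypothetical first-order model with unit gain and a 0.5 s time constant, and the gains `kp` and `ki` are illustrative values, not those used in the study.

```python
import math

R_AIR = 287.05  # specific gas constant of dry air, J/(kg K)


def air_density(static_pressure_pa, temperature_k):
    """Ideal-gas air density from static pressure and temperature."""
    return static_pressure_pa / (R_AIR * temperature_k)


def pitot_velocity(dynamic_pressure_pa, static_pressure_pa, temperature_k):
    """Velocity from the pitot-static pressure difference: v = sqrt(2 * dp / rho)."""
    rho = air_density(static_pressure_pa, temperature_k)
    return math.sqrt(2.0 * dynamic_pressure_pa / rho)


def simulate_pi(setpoint, kp=0.5, ki=2.0, tau=0.5, dt=0.01, steps=1000):
    """Run a PI loop on a hypothetical first-order tunnel model; return final velocity."""
    velocity = 0.0
    integral = 0.0
    for _ in range(steps):
        error = setpoint - velocity
        integral += error * dt
        command = kp * error + ki * integral  # PI control law
        # Assumed plant: the velocity relaxes towards the command with time constant tau.
        velocity += dt * (command - velocity) / tau
    return velocity


rho = air_density(101325.0, 288.15)
v_measured = pitot_velocity(61.25, 101325.0, 288.15)
v_final = simulate_pi(setpoint=10.0)
print(rho, v_measured, v_final)
```

The integral term is what removes the steady-state error: with proportional action alone, this assumed plant would settle below the requested speed.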
Abstract:
Air quality in animal production environments has been regarded as an important topic for studies of environmental control systems, with a focus on the health of both the animals, which live in total confinement, and the workers. The objective of this research was to determine the variation in aerial environmental quality in two types of broiler housing: conventional (Gc) and tunnel type (Gt). The total dust values in both houses offered adequate rearing conditions for the birds; however, the inhalable dust in the air was above the limits recommended for humans. Carbon monoxide concentration in the heating phase during the evaluated period was above the recommended maximum of 10 ppm, and it was higher during the cold season in the Gt house (30 ppm) than in the Gc house (18 ppm). Ammonia concentration peaks in the air were above the recommended 20 ppm from the 20th day of production in both houses and, on a daily average, for a longer period in Gt (4h30) than in Gc (2h45). Only traces of nitric oxide and methane were found, while the carbon dioxide concentration evaluated during the daytime met the limits allowed for both birds and workers.
Abstract:
This research investigated the use of diazepam in conjunction with behavioral strategies to manage uncooperative behavior of child dental patients. The 6 participants received dental treatment over 9 sessions. Using a double-blind design, children received placebo or diazepam and at the same time were subjected to behavior management procedures (distraction, explanation, reinforcement, and setting rules and limits). All sessions were recorded on videotape, with beeps marking 15-second intervals in which observers recorded the child's behavior (crying, body and/or head movements, escape and avoidance) and the dentist's behavior. The results indicated that diazepam, at the dose used, was effective with only one subject. The other participants did not permit the treatment and showed an increase in their resistance. The behavioral preparation strategies for dental treatment should have been planned more precisely in order to help the child face the real conditions of dental treatment, mainly in the first sessions, and to avoid reinforcing inappropriate behaviors.
Abstract:
The present investigation evaluated the effects of diazepam used to manage uncooperative behavior of child dental patients. In a double-blind design, six participants received placebo or diazepam (0.3 mg/kg body weight) before formal dental treatment over a total of 54 sessions, all of which were recorded on videotape. Analysis of the recorded child behavior (crying, body and/or head movements, escape and avoidance) and of the dentist's behavior management procedures (distraction, explanation, positive reinforcement) with the Wilcoxon test indicated no differences (p>0.05). Methodological refinement is suggested for studies that combine psychological and pharmacological management strategies.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
Despite the advances in bonding materials, many clinicians today still prefer to place bands on molar teeth. Molar bonding procedures need improvement to be widely accepted clinically. OBJECTIVE: The purpose of this study was to evaluate the shear bond strength when an additional adhesive layer was applied on the occlusal tooth/tube interface to provide reinforcement to molar tubes. MATERIAL AND METHODS: Sixty third molars were selected and allocated to the 3 groups: group 1 received a conventional direct bond followed by the application of an additional layer of adhesive on the occlusal tooth/tube interface, group 2 received a conventional direct bond, and group 3 received a conventional direct bond and an additional cure time of 10 s. The specimens were debonded in a universal testing machine. The results were analyzed statistically by ANOVA and Tukey's test (α=0.05). RESULTS: Group 1 had a significantly higher (p<0.05) shear bond strength compared to groups 2 and 3. No difference was detected between groups 2 and 3 (p>0.05). CONCLUSIONS: The present in vitro findings indicate that the application of an additional layer of adhesive on the tooth/tube interface increased the shear bond strength of the bonded molar tubes.
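For readers unfamiliar with the statistical comparison named above, the following sketch computes the one-way ANOVA F statistic for three groups. The numbers are invented purely for illustration and are not the study's measurements. The sketch stops at the F statistic and does not compute a p-value or perform Tukey's post hoc test.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented values, three per group, chosen only to keep the arithmetic easy to follow.
illustrative_groups = [[5.0, 6.0, 7.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
f_statistic = one_way_anova_f(illustrative_groups)
print(f_statistic)
```

A large F value indicates that the group means differ by more than the scatter within groups would explain; a post hoc test such as Tukey's then identifies which pairs of groups differ.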
Abstract:
Advances in the care of trauma patients and of patients with severe abdominal infections are responsible for a growing number of peritoneostomies. The management of this condition is complex, and several techniques have been described for its treatment. The concept of dynamic closure of the abdominal wall was recently introduced in the literature, with high success rates. The aim of this paper is to serve as a preliminary note on a new approach to the treatment of peritoneostomies, developed at the Hospital Universitário da Universidade de São Paulo. It is a simple, low-cost procedure that can easily be performed by a general surgeon. The procedure was also used prophylactically as reinforcement in tense abdominal closures. The procedure is described in detail, together with the results in the first patients. Although the technique is promising, technical refinements and further studies are needed to validate it.