953 results for tunnel reinforcement


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis addresses Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend state-of-the-art methods, allowing for the faster and more stable learning process that Robotics requires. The Q-learning update rule is widely applied, since it allows learning without a model of the environment. However, this update rule is transition-based and does not take advantage of the underlying episodic structure of the collected batch of interactions. The Q-Batch update rule is proposed in this thesis to process experiences along the trajectories collected in the interaction phase. This allows a faster propagation of obtained rewards and penalties, resulting in faster and more robust learning. Non-parametric function approximators, such as Gaussian Processes, are explored. This type of approximator makes it possible to encode prior knowledge about the latent function in the form of kernels, providing a higher level of flexibility and accuracy. The application of Gaussian Processes in Batch Reinforcement Learning yielded higher performance in learning tasks than other function approximators used in the literature. Lastly, in order to extract more information from the experiences collected by the agent, model-learning techniques are incorporated to learn the system dynamics. In this way, it is possible to augment the set of collected experiences with experiences generated through planning using the learned models. Experiments were carried out mainly in simulation, with some tests carried out on a physical robotic platform. The results show that the proposed approaches are able to outperform classical Fitted Q Iteration.
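The contrast between transition-based updates and updates that follow the episodic structure can be illustrated with a small sketch. The code below is not the thesis's Q-Batch rule, whose details the abstract does not give. It only applies the standard tabular Q-learning update to one recorded episode, once in collection order and once backwards along the trajectory. The function names, the 5-state chain environment, and the values of alpha and gamma are assumptions made for illustration.

```python
def q_update(Q, s, a, r, s_next, done, alpha=1.0, gamma=0.9):
    """Standard Q-learning update for a single transition (s, a, r, s_next)."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def make_chain_episode(n_states):
    """One episode on a hypothetical chain: a single action moves right,
    and reward 1 is given only on reaching the last state."""
    episode = []
    for s in range(n_states - 1):
        s_next = s + 1
        done = s_next == n_states - 1
        episode.append((s, 0, 1.0 if done else 0.0, s_next, done))
    return episode


def sweep(episode, n_states, backward):
    """Apply one pass of Q-learning updates over the recorded episode,
    either in collection order or backwards along the trajectory."""
    Q = [[0.0] for _ in range(n_states)]  # one action per state
    ordered = reversed(episode) if backward else episode
    for s, a, r, s_next, done in ordered:
        q_update(Q, s, a, r, s_next, done)
    return Q


if __name__ == "__main__":
    episode = make_chain_episode(5)
    print("collection order:", sweep(episode, 5, backward=False))
    print("along trajectory:", sweep(episode, 5, backward=True))
```

In this toy setting, a single pass in collection order only updates the state next to the goal, while a single backward pass carries the discounted reward all the way to the start state (about 0.729 for gamma = 0.9 over three steps).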

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Simulations of droplet dispersion behind cylinder wakes and downstream of icing tunnel spray bars were conducted. In both cases, a range of droplet sizes was investigated numerically with a Lagrangian particle trajectory approach, while the turbulent air flow was simulated with a hybrid Reynolds-Averaged Navier-Stokes/Large-Eddy Simulation scheme. In the first study, droplets were injected downstream of a cylinder at sub-critical conditions (i.e. with laminar boundary layer separation). A stochastic continuous random walk (CRW) turbulence model was used to capture the effects of sub-grid turbulence. Small-inertia droplets (characterized by small Stokes numbers) were affected by both the large-scale and small-scale vortex structures and closely followed the air flow, while exhibiting a dispersion consistent with that of a scalar flow field. Droplets with intermediate Stokes numbers were centrifuged by the vortices to the outer edges of the wake, yielding an increased dispersion. Large Stokes number droplets were found to be less responsive to the vortex structures and exhibited the least dispersion. Particle concentration was also correlated with the vorticity distribution, which yielded preferential bias effects as a function of particle size. This trend was qualitatively similar to results seen in homogeneous isotropic turbulence, though the influence of particle inertia was less pronounced for the cylinder wake case. A similar study was completed for droplet dispersion within the Icing Research Tunnel (IRT) at the NASA Glenn Research Center, where it is important to obtain a nearly uniform liquid water content (LWC) distribution in the test section (to recreate atmospheric icing conditions). To this end, droplets are diffused by the mean and turbulent flow generated from the nozzle air jets, from the upstream spray bars, and from the vertical strut wakes.
To understand the influence of these three components, a set of simulations was conducted with a sequential inclusion of these components. Firstly, a jet in an otherwise quiescent airflow was simulated to capture the impact of the air jet on flow turbulence and droplet distribution, and the predictions compared well with experimental results. The effects of the spray bar wake and vertical strut wake were then included with two more simulation conditions, for which it was found that the air jets were the primary driving force for droplet dispersion, i.e. that the spray bar and vertical strut wake effects were secondary.
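Since the abstract classifies droplets by Stokes number, a short sketch of how that quantity is commonly computed may help. It assumes the Stokes-drag response time tau_p = rho_p d^2 / (18 mu) and a flow time scale D / U built from a characteristic length and speed. The study may use a different definition, and all numerical values below are illustrative assumptions, not data from the simulations.

```python
def particle_response_time(droplet_density, droplet_diameter, air_viscosity):
    """Stokes-drag response time tau_p = rho_p * d**2 / (18 * mu), in seconds."""
    return droplet_density * droplet_diameter ** 2 / (18.0 * air_viscosity)


def stokes_number(droplet_density, droplet_diameter, air_viscosity,
                  flow_speed, length_scale):
    """Ratio of the droplet response time to the flow time scale D / U."""
    tau_p = particle_response_time(droplet_density, droplet_diameter, air_viscosity)
    tau_f = length_scale / flow_speed
    return tau_p / tau_f


if __name__ == "__main__":
    # Illustrative values only: water droplets in air, a 0.1 m cylinder, 10 m/s flow.
    for diameter_um in (5, 20, 100):
        st = stokes_number(
            droplet_density=1000.0,            # kg/m^3
            droplet_diameter=diameter_um * 1e-6,  # m
            air_viscosity=1.8e-5,              # Pa*s
            flow_speed=10.0,                   # m/s
            length_scale=0.1,                  # m
        )
        print(f"d = {diameter_um} um -> St = {st:.4f}")
```

Because the response time grows with the square of the diameter, a modest change in droplet size moves a droplet between the small, intermediate, and large Stokes number regimes described above.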

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Cover title.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

International audience

Relevance:

20.00% 20.00%

Publisher:

Abstract:

"This research was supported by the McDonnell Aircraft Corporation under Contract no. 6140-20 P.O. 7S4899-R."

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Mode of access: Internet.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: This paper is a commentary to a debate article entitled: "Are we overpathologizing everyday life? A tenable blueprint for behavioral addiction research", by Billieux et al. (2015). Methods and aim: This brief response focused on the necessity to better characterize psychological and related neurocognitive determinants of persistent deleterious actions associated or not with substance utilization. Results: A majority of addicted people could be driven by psychological functional reasons to keep using drugs, gambling or buying despite the growing number of related negative consequences. In addition, a non-negligible proportion of them would need assistance to restore profound disturbances in basic learning processes involved in compulsive actions. Conclusions: The distinction between psychological functionality and compulsive aspects of addictive behaviors should represent a big step towards more efficient treatments.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Cigarette smoking remains the leading preventable cause of death and disability in the United States and is most often initiated during adolescence. An emerging body of research suggests that a negative reinforcement model may explain factors that contribute to tobacco use during adolescence, and that negative reinforcement processes may contribute to tobacco use to a greater extent among female adolescents than among male adolescents. However, the extant literature on the relationship between negative reinforcement processes and adolescent tobacco use, as well as on the relationship between gender, negative reinforcement processes, and adolescent tobacco use, is limited by its sole reliance on self-report measures of the negative reinforcement processes that may contribute to cigarette smoking. The current study aimed to further disentangle the relationships between negative reinforcement based risk taking, gender, and tobacco use during older adolescence by utilizing a behavioral analogue measure of negative reinforcement based risk taking, the Maryland Resource for the Behavioral Utilization of the Reinforcement of Negative Stimuli (MRBURNS). Specifically, we examined the relationship between pumps on the MRBURNS, an indicator of risk taking, and smoking status, as well as the interaction between MRBURNS pumps and gender for predicting smoking status. Participants included 103 older adolescents (n = 51 smokers, 50.5% female, age M(SD) = 19.41 (1.06)) who all attended one experimental session during which they completed the MRBURNS as well as self-report measures of tobacco use, nicotine dependence, alcohol use, depression, and anxiety. We utilized binary logistic regressions to examine the relationship between MRBURNS pumps and smoking status as well as the interactive effect of MRBURNS pumps and gender for predicting smoking status.
Controlling for relevant covariates, pumps on the MRBURNS did not significantly predict smoking status and the interaction between pumps on the MRBURNS and gender also did not significantly predict smoking status. These findings highlight the importance of future research examining various task modifications to the MRBURNS as well as the need for replications of this study with larger, more diverse samples.
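The binary logistic regression with an interaction term described above can be written out as follows. This is a generic form under stated assumptions, not the authors' exact specification: the covariate vector z_i stands in for the unspecified "relevant covariates", and the coding of gender is not given in the abstract.

```latex
\operatorname{logit} P(\text{smoker}_i = 1)
  = \beta_0
  + \beta_1\,\text{pumps}_i
  + \beta_2\,\text{gender}_i
  + \beta_3\,(\text{pumps}_i \times \text{gender}_i)
  + \boldsymbol{\gamma}^{\top}\mathbf{z}_i
```

In this form, the reported null results correspond to neither \beta_1 (the main effect of MRBURNS pumps) nor \beta_3 (the pumps-by-gender interaction) differing significantly from zero.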

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Tall buildings are wind-sensitive structures and can experience strong wind-induced effects. Aerodynamic boundary layer wind tunnel testing has been the most commonly used method for estimating wind effects on tall buildings. Design wind effects on tall buildings are estimated through analytical processing of the data obtained from aerodynamic wind tunnel tests. Even though it is widely agreed that the data obtained from wind tunnel testing are fairly reliable, the post-test analytical procedures are still argued to carry considerable uncertainties. This research work attempted to assess in detail the uncertainties occurring at different stages of the post-test analytical procedures and to suggest improved techniques for reducing them. Results of the study showed that traditionally used simplifying approximations, particularly in the frequency domain approach, could cause significant uncertainties in estimating aerodynamic wind-induced responses. Based on the identified shortcomings, a more accurate dual aerodynamic data analysis framework, which works in both the frequency and time domains, was developed. The comprehensive analysis framework allows the modal, resultant, and peak values of various wind-induced responses of a tall building to be estimated more accurately. Estimating design wind effects on tall buildings also requires synthesizing the wind tunnel data with local climatological data for the study site. After investigating the causes of significant uncertainties in currently used synthesizing techniques, a novel copula-based approach was developed for accurately synthesizing aerodynamic and climatological data. The improvement of the new approach over the existing techniques was also illustrated with a case study on a 50-story building. Lastly, a practical dynamic optimization approach was suggested for tuning the structural properties of tall buildings towards optimum performance against wind loads with fewer design iterations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Traditional knowledge associated with genetic resources (TKaGRs) is acknowledged as a valuable resource. Its value draws from economic, social, cultural, and innovative uses. This value places TK at the heart of competing interests between indigenous peoples, who hold it and depend on it for their survival, and profitable industries, which seek to exploit it in the global market space. The latter group seeks, inter alia, to advance and maintain its global competitiveness by exploiting TKaGRs leads in its research and development activities connected with modern innovation. Biopiracy remains an issue of central concern to the developing world and has emerged in this context as a label for the inequity arising from the misappropriation of TKaGRs located in the South by commercial interests usually located in the North. Significant attention and resources are being channeled into global efforts to design and implement effective mechanisms for protecting TKaGRs against biopiracy. The emergence and recent entry into force of the Nagoya Protocol offers the latest example of a concluded multilateral effort in this regard. The Nagoya Protocol, adopted on the platform of the Convention on Biological Diversity (CBD), establishes an open-ended international access and benefit sharing (ABS) regime comprising the Protocol as well as several complementary instruments. By focusing on the trans-regime nature of biopiracy, this thesis argues that the intellectual property (IP) system forms a central part of the problem of biopiracy, and likewise of the very efforts to implement solutions, including through the Nagoya Protocol.
The ongoing related work within the World Intellectual Property Organization (WIPO), aimed at developing an international instrument (or a series of instruments) to address the effective protection of TK, constitutes an essential complementary process to the Nagoya Protocol, and, as such, forms a fundamental element within the Nagoya Protocol’s evolving ABS regime-complex. By adopting a third world approach to international law, this thesis draws central significance from its reconceptualization of biopiracy as a trans-regime concept. By construing the instrument(s) being negotiated within WIPO as forming a central component part of the Nagoya Protocol, this dissertation’s analysis highlights the importance of third world efforts to secure an IP-based reinforcement to the Protocol for the effective eradication of biopiracy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

That humans and animals learn from interaction with the environment is a foundational idea underlying nearly all theories of learning and intelligence. Learning that certain outcomes are associated with specific actions or stimuli (both internal and external) is at the very core of the capacity to adapt behaviour to environmental changes. In the present work, appetitive and aversive reinforcement learning paradigms have been used to investigate the fronto-striatal loops and behavioural correlates of adaptive and maladaptive reinforcement learning processes, aiming at a deeper understanding of how cortical and subcortical substrates interact with each other and with other brain systems to support learning. By combining a large variety of neuroscientific approaches, including behavioural and psychophysiological methods, EEG, and neuroimaging techniques, these studies aim at clarifying and advancing the knowledge of the neural bases and computational mechanisms of reinforcement learning, both in healthy and in neurologically impaired populations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Nowadays, reinforcement learning has proven to be highly effective in machine learning across a variety of fields, such as games, speech recognition, and many others. We therefore decided to apply reinforcement learning to allocation problems, since they are a research area not yet studied with this technique and because their formulation encompasses a broad set of sub-problems with similar characteristics, so that a solution for one of them extends to each of these sub-problems. In this project we built an application called Service Broker which, through reinforcement learning, learns how to distribute the execution of tasks across asynchronous, distributed workers. The analogy is that of a cloud data center, which has internal resources (possibly distributed across the server farm), receives tasks from its clients, and executes them on these resources. The goal of the application, and hence of the data center, is to allocate these tasks so as to minimize the execution cost. In addition, in order to test the reinforcement learning agents developed, an environment (a simulator) was created that made it possible to concentrate on developing the components the agents need, rather than also having to deal with implementation aspects required in a real data center, such as communication with the various nodes and its latency. The results obtained confirmed the theory studied, achieving better performance than some of the classical methods for task allocation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The increased exploitation of carbon fiber reinforced polymers (CFRP) is inevitably bringing about an increase in production scraps and end-of-life components, resulting in a sharp increase in CFRP waste. Therefore, it is of paramount importance to find efficient ways to reintroduce waste into the manufacturing cycle. At present, several recycling methods for treating CFRPs are available, although all of them still have to be optimized. The step after CFRP recycling, and also the key to building a solid and sustainable CFRP recycling market, is the utilization of recovered carbon fibers (Re-CFs). The smartest way to utilize recovered carbon fibers is through the manufacturing of recycled CFRPs, which can be done by re-impregnating the recovered fibers with a new polymeric matrix. Fused Filament Fabrication (FFF) is one of the most widely used additive manufacturing (3D printing) techniques. It fabricates parts through a polymeric filament deposition process that adds material layer by layer, only where it is needed, saving energy, raw material cost, and waste. The filament can also contain fillers or reinforcements such as recycled short carbon fibers, which makes the process well suited to the re-application of the shortened recycled CFs. Therefore, in this thesis work, recycled and virgin carbon fiber reinforced PLA filaments were first produced with CF loads of 5% and 10%. Properties and characteristics of the filaments were determined through different analyses (TGA, DMA, DSC). Subsequently, the 5 wt.% Re-CF filament was used to 3D print specimens for mechanical characterization (DMA, tensile test, and CTE), in order to evaluate the properties of printed PLA composites containing Re-CFs and the feasibility of using Re-CFs in 3D printing applications.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The first part of my work presents a study of an initial "from scratch" solution developed by Andrew Karpathy. Two improvements of my own follow. The first modifies the code of that solution directly and introduces, as an additional objective for the network in the early stages of play, the interception of the ball by the paddle, which improves the initial training. The second is my own implementation using more complex algorithms, which are state of the art on Atari games and lead to much faster training of the network.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Reinforcement learning is a particular paradigm of machine learning that has recently proved, time and time again, to be a very effective and powerful approach. Cryptography, on the other hand, usually takes the opposite direction: while machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding such data. However, the two techniques can be used jointly to create privacy-preserving models, able to make inferences on the data without leaking sensitive information. Despite the large number of studies performed on machine learning and cryptography, reinforcement learning in particular has never been applied to such cases before. Being able to successfully make use of reinforcement learning in an encrypted scenario would allow us to create an agent that efficiently controls a system without providing it with full knowledge of the environment it is operating in, opening the way to many possible use cases. Therefore, we have decided to apply the reinforcement learning paradigm to encrypted data. In this project we have applied one of the most well-known reinforcement learning algorithms, Deep Q-Learning, to simple simulated environments and studied how encryption affects the training performance of the agent, in order to see whether it is still able to learn how to behave even when the input data is no longer readable by humans. The results of this work show that the agent is still able to learn without issues in small state spaces with non-secure encryption, such as AES in ECB mode. For fixed environments, it is also able to reach a suboptimal solution even with secure modes, such as AES in CBC mode, showing a significant improvement over a random agent; however, its ability to generalize in stochastic environments or large state spaces suffers greatly.
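One plausible reason for the gap between ECB and CBC results is that ECB maps equal plaintext blocks to equal ciphertext blocks, so an encrypted state still identifies the state consistently, whereas CBC with a fresh initialization vector does not. The sketch below illustrates only that structural difference. It does not use AES: `toy_block_encrypt` is a hypothetical keyed function built from HMAC-SHA256 as a stand-in, it is not invertible and offers no real encryption, and the zero padding and the assumption of a new IV per observation are likewise illustrative choices, not details taken from the abstract.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes, matching AES


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Keyed pseudorandom function standing in for AES (NOT a real cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def pad(data: bytes) -> bytes:
    """Zero-pad to a multiple of the block size (illustrative only)."""
    remainder = len(data) % BLOCK
    return data if remainder == 0 else data + b"\x00" * (BLOCK - remainder)


def ecb_like(key: bytes, data: bytes) -> bytes:
    """ECB structure: every block is transformed independently."""
    data = pad(data)
    return b"".join(
        toy_block_encrypt(key, data[i:i + BLOCK])
        for i in range(0, len(data), BLOCK)
    )


def cbc_like(key: bytes, iv: bytes, data: bytes) -> bytes:
    """CBC structure: each block is XORed with the previous output (or IV) first."""
    data = pad(data)
    previous, out = iv, []
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], previous))
        previous = toy_block_encrypt(key, mixed)
        out.append(previous)
    return b"".join(out)


if __name__ == "__main__":
    key = b"k" * 16
    state = b"cart position 03"  # hypothetical observation
    print(ecb_like(key, state) == ecb_like(key, state))  # same state, same code
    print(cbc_like(key, b"\x00" * 16, state) == cbc_like(key, b"\x01" * 16, state))
```

Under the ECB-like scheme the agent effectively sees a relabeled but consistent state space, which fits the abstract's observation that learning is unaffected in small state spaces. Under the CBC-like scheme the same state yields different inputs whenever the IV changes.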