15 results for High-Performance Computing
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
The study analyses the calibration process of a newly developed high-performance plug-in hybrid electric passenger car powertrain. The complexity of modern powertrains and increasingly restrictive pollutant-emission regulations are the primary challenges in calibrating a vehicle’s powertrain. In addition, OEM managers need to know as early as possible whether the vehicle under development will meet its technical targets (emissions included). This creates the need for advanced calibration methodologies that keep powertrain development robust and time- and cost-effective. The suggested solution is virtual calibration, which allows the control functions of a powertrain to be tuned before it is built. The aim of this study is to virtually calibrate the hybrid control unit functions in order to optimize pollutant emissions and fuel consumption. Starting from the model of the conventional vehicle, the powertrain is hybridized and integrated with emissions and aftertreatment models. After the model is validated, the hybrid control unit strategies are optimized using the Model-in-the-Loop testing methodology. The calibration activities will then proceed with the implementation of a Hardware-in-the-Loop environment, which will allow the Engine and Transmission control units to be tested and calibrated effectively while saving time and cost.
Abstract:
Gaze estimation has gained interest in recent years as an important cue for obtaining information about the internal cognitive state of humans. Whether it targets the 3D gaze vector or the point of gaze (PoG), gaze estimation has been applied in various fields, such as human-robot interaction, augmented reality, medicine, aviation and automotive. In the latter field, as part of Advanced Driver-Assistance Systems (ADAS), it allows the development of cutting-edge systems capable of mitigating road accidents by monitoring driver distraction. Gaze estimation can also be used to enhance the driving experience, for instance in autonomous driving. It can also improve comfort through augmented reality components that the driver commands with their eyes. Although several high-performance real-time inference works already exist, only a few can work with just an RGB camera on computationally constrained devices, such as a microcontroller. This work aims to develop a low-cost, efficient and high-performance embedded system capable of estimating the driver's gaze using deep learning and an RGB camera. The proposed system achieves near-SOTA performance with an approximately 90% smaller memory footprint. Its ability to generalize to unseen environments was evaluated through a live demonstration, in which high performance and near real-time inference were obtained using a webcam and a Raspberry Pi 4.
Abstract:
The idea of Grid Computing originated in the nineties and found concrete application in contexts like the SETI@home project, where a large number of computers (offered by volunteers) cooperated inside the Grid environment, performing distributed computations to analyse radio signals in search of extraterrestrial life. The Grid was composed of traditional personal computers but, with the emergence of the first mobile devices such as Personal Digital Assistants (PDAs), researchers started theorizing the inclusion of mobile devices in Grid Computing. Although impressive theoretical work was done, the idea was discarded because of the (mainly technological) limitations of the mobile devices available at the time. Decades have passed, and mobile devices are now far more powerful and numerous than before, leaving a great amount of resources on smartphones and tablets untapped. Here we propose a solution for performing distributed computations over a Grid Computing environment that uses both desktop and mobile devices, exploiting resources from everyday mobile users that would otherwise go unused. The work starts with an introduction to what Grid Computing is, the evolution of mobile devices, the idea of integrating such devices into the Grid, and how to convince device owners to participate in the Grid. The tone then becomes more technical, starting with an explanation of how Grid Computing actually works, followed by the technical challenges of integrating mobile devices into the Grid. Next, the model that constitutes the solution offered by this study is explained, followed by a chapter on the realization of a prototype that proves the feasibility of distributed computations over a Grid composed of both mobile and desktop devices. To conclude, future developments and ideas to improve this project are presented.
Abstract:
Complex network analysis is a very popular topic in computer science. Unfortunately, these networks, extracted from different contexts, are usually very large, so computing metrics on them can be very expensive. Among all metrics, we analyse the extraction of subnetworks called communities: groups of nodes that probably play the same role within the whole structure. Community extraction is an interesting operation in many different fields (biology, economics, ...). In this work we present a parallel community detection algorithm that can operate on networks with a huge number of nodes and edges. After an introduction to graph theory and high performance computing, we explain our design strategies and our implementation. We then show performance evaluations carried out on a distributed-memory architecture, namely the IBM BlueGene/Q supercomputer "Fermi" at the CINECA supercomputing center, Italy, and comment on our results.
Abstract:
This thesis analyses the DistributedSolvingSet and LazyDistributedSolvingSet algorithms and presents experimental results for the latter.
Abstract:
In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches that can only work in the context of High Performance Computing with very large amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". The dissertation tries to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. In practice, these answers are based on an exhaustive comparison between two very different deep learning techniques on that task: the Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They represent two different approaches and points of view under the broad umbrella of deep learning, and comparing them is a good way to understand and point out the strengths and weaknesses of each. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporations like Google and Facebook for solving face recognition and image auto-tagging problems.
HTM, on the other hand, is known as a new emerging paradigm and a mainly unsupervised method that is more biologically inspired. It tries to gain more insights from the computational neuroscience community in order to incorporate into the learning process concepts like time, context and attention, which are typical of the human brain. Ultimately, the thesis aims to show that in certain cases, with a smaller quantity of data, HTM can outperform CNN.
Abstract:
High Performance Computing is a technology used by computational clusters to build processing systems that can deliver far more powerful services than traditional computers. As a result, HPC technology has become a decisive factor in industrial competition and in research. HPC systems keep growing in terms of nodes and cores. Forecasts indicate that the number of nodes will soon reach one million. This type of architecture also incurs very high costs in terms of resource consumption, which become unsustainable for the industrial market. A centralized scheduler cannot manage such a large number of resources while maintaining a reasonable response time. This thesis presents a distributed scheduling model based on constraint programming, which models the scheduling problem through a set of temporal constraints and resource constraints that must be satisfied. The scheduler tries to optimize resource performance and tends toward a desired consumption profile, considered optimal. Several different models are analysed, and each of them is tested in various environments.
Abstract:
Since the end of the long winter of virtual reality (VR) at the beginning of the 2010s, many improvements have been made in the performance and cost of hardware technologies and software platforms. Many expect this trend to continue, with the penetration rate of virtual reality headsets skyrocketing at some point in the future, just as mobile platforms did before. In the meantime, virtual reality is slowly transitioning from a specialized laboratory-only technology to a consumer electronics appliance, opening interesting opportunities and challenges. In this transition, two interesting research questions are how 2D-based content and applications may benefit from (or be hurt by) the adoption of 3D-based immersive environments, and how to support such integration effectively. Acknowledging the relevance of the former, we here consider the latter question, focusing our attention on the diversified family of PC-based simulation tools and platforms. VR-based visualization is, in fact, widely understood and appreciated in the simulation arena, but mainly confined to high performance computing laboratories. Our contribution aims to characterize the simulation tools that could benefit from immersive interfaces, and to provide a general framework and a preliminary implementation that may be used to support their transition from purely 2D to blended 2D/3D environments.
Abstract:
LHC experiments produce an enormous amount of data, estimated to be of the order of a few petabytes per year. Data management takes place on the Worldwide LHC Computing Grid (WLCG) infrastructure, both for storage and for processing operations. However, in recent years, many more resources have become available on High Performance Computing (HPC) farms, which generally have many computing nodes with a high number of processors. Large collaborations are working to use these resources in the most efficient way, consistent with the constraints imposed by computing models (data distributed on the Grid, authentication, software dependencies, etc.). The aim of this thesis project is to develop a software framework that allows users to process a typical data analysis workflow of the ATLAS experiment on HPC systems. The developed analysis framework will be deployed on the computing resources of the Open Physics Hub project and on the CINECA Marconi100 cluster, in view of the switch-on of the Leonardo supercomputer, foreseen in 2023.
Abstract:
Over the years, research efforts in High Performance Computing have produced important results in improving performance, both in terms of the number of operations performed per unit of time and through the introduction or improvement of parallel algorithms in the literature. These achievements have brought changes to the internal structure of the machines: processor architectures have evolved, and GPUs have been adopted as additional computing resources. A consequence of the continuous increase in performance is the need to cope with high energy consumption, since HPC machines are designed to carry out intensive computation over very long periods of time; the energy needed to power each node and dissipate the heat generated entails high costs. Among the solutions proposed to limit energy consumption, the one that has attracted the most interest, both in research and in the market, is the integration of RISC (Reduced Instruction Set Computer) CPUs, since they can achieve satisfactory performance with lower energy use than CISC (Complex Instruction Set Computer) CPUs. This thesis presents a performance analysis of Monte Cimone, a cluster composed of 8 compute nodes based on the RISC-V architecture and distributed across 4 dual-board platforms (blades). Benchmarks will be run to evaluate: the performance of long- and short-distance data exchange; the performance in solving problems with reduced spatial locality; and the performance in solving graph problems, specifically breadth-first search and single-source shortest paths.
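For readers unfamiliar with the two graph problems named in this abstract, the following is a minimal serial sketch of breadth-first search and single-source shortest paths. It is not the benchmark code run on Monte Cimone: the function names, the toy graph, and the choice of Dijkstra's algorithm for shortest paths are all illustrative assumptions.

```python
import heapq
from collections import deque

def bfs_levels(adj, source):
    """Return the hop count from source to every reachable vertex."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, _weight in adj.get(u, []):
            if v not in level:          # first visit gives the shortest hop count
                level[v] = level[u] + 1
                queue.append(v)
    return level

def sssp(adj, source):
    """Single-source shortest paths via Dijkstra (non-negative weights assumed)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                    # stale heap entry, a shorter path was found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy graph: vertex -> list of (neighbour, edge weight)
GRAPH = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
```

Both kernels access memory irregularly, following edges rather than contiguous arrays, which is one reason graph workloads are commonly used to stress a machine's memory and interconnect rather than its floating-point units.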
Abstract:
With the advent of high-performance computing devices, deep neural networks have gained a lot of popularity in solving many Natural Language Processing tasks. However, they are also vulnerable to adversarial attacks, which modify the input text in order to mislead the target model. Adversarial attacks are a serious threat to the security of deep neural networks, since they can be used to craft adversarial examples that steer the model towards a wrong decision. In this dissertation, we propose SynBA, a novel contextualized synonym-based adversarial attack for text classification. SynBA is based on the idea of replacing words in the input text with synonyms selected according to the context of the sentence. We show that SynBA generates adversarial examples that fool the target model with a high success rate. We demonstrate three advantages of the proposed approach: (1) effective - it outperforms state-of-the-art attacks in terms of semantic similarity and perturbation rate, (2) utility-preserving - it preserves semantic content, grammaticality, and the correct classes as judged by humans, and (3) efficient - it performs attacks faster than other methods.
Abstract:
This work analyses the performance of the parallel programming language Chapel on the Integer Sort kernel of the NAS Parallel Benchmarks. In practice, this algorithm is used in studies or applications of particle methods. The fundamental concepts of parallel programming will be introduced, followed by the main features of MPI and Chapel. Integer Sort and its implementation details will then be examined in depth, concluding with a performance analysis of the two languages on the kernel under study.
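As a rough illustration of what sorting integer keys from a bounded range involves, the sketch below uses a counting sort: count each key, take a prefix sum to find where each key's run starts, then place the keys. This is a serial Python sketch under the assumption that keys lie in `0..max_key`; it is not the Chapel or MPI implementation studied in the thesis, and the function name is hypothetical.

```python
def counting_sort(keys, max_key):
    """Sort non-negative integer keys in the range 0..max_key."""
    # Step 1: histogram of key occurrences.
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1

    # Step 2: exclusive prefix sum, so starts[k] = number of keys smaller than k.
    starts = [0] * (max_key + 1)
    total = 0
    for k in range(max_key + 1):
        starts[k] = total
        total += counts[k]

    # Step 3: place each key at the next free slot of its run.
    out = [0] * len(keys)
    for k in keys:
        out[starts[k]] = k
        starts[k] += 1
    return out
```

In a parallel setting, the histogram and prefix-sum steps are where processes typically need to exchange data, which is why a kernel of this kind exercises communication as well as integer arithmetic.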
Abstract:
Modern High-Performance Computing (HPC) systems are gradually increasing in size and complexity because of the corresponding demand for larger simulations that require more complicated tasks and higher accuracy. However, as a side effect of Dennard scaling approaching its ultimate power limit, software efficiency also plays an important role in increasing the overall performance of a computation. Tools that measure application performance in these increasingly complex environments provide insights into the intricate ways in which software and hardware interact. Power consumption can be monitored, in order to save energy, through processor interfaces like Intel Running Average Power Limit (RAPL). Given the low level of these interfaces, they are often paired with an application-level tool like the Performance Application Programming Interface (PAPI). Since several problems in many heterogeneous fields can be represented as a complex linear system, an optimized and scalable linear system solver can significantly decrease the time spent computing the solution. One of the most widely used algorithms for solving the linear systems that arise in large simulations is Gaussian Elimination, whose most popular implementation for HPC systems is in the Scalable Linear Algebra PACKage (ScaLAPACK) library. However, another relevant algorithm, which is growing in popularity in the academic field, is the Inhibition Method. This thesis compares the energy consumption of the Inhibition Method and of Gaussian Elimination from ScaLAPACK, profiling their execution while solving linear systems on the HPC architecture offered by CINECA. Moreover, it also collates the energy and power values for different rank, node, and socket configurations. The monitoring tools employed to track the energy consumption of these algorithms are PAPI and RAPL, which are integrated with the parallel execution of the algorithms managed through the Message Passing Interface (MPI).
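To make the baseline algorithm concrete, here is a minimal textbook sketch of Gaussian elimination with partial pivoting for a small dense system. It is a serial, pure-Python illustration only: it is not the ScaLAPACK routine the thesis profiles, it does not cover the Inhibition Method, and the function name and singularity tolerance are assumptions.

```python
def solve_gaussian(a, b):
    """Solve a x = b for a square matrix a (list of rows) and right-hand side b."""
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:   # assumed tolerance
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x
```

The triple loop in the elimination phase performs on the order of n³ operations, which is why the choice of solver matters for both run time and energy use on large systems.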
Abstract:
The scientific success of the LHC experiments at CERN depends heavily on the availability of computing resources that efficiently store, process, and analyse the data collected every year. This is ensured by the Worldwide LHC Computing Grid infrastructure, which connects computing centres distributed all over the world through a high-performance network. The LHC has an ambitious experimental program for the coming years, which includes large investments and improvements both in the detector hardware and in the software and computing systems, in order to deal with the large increase in event rate expected from the High Luminosity LHC (HL-LHC) phase and, consequently, with the huge amount of data that will be produced. In the last few years, Artificial Intelligence has taken on a relevant role in the High Energy Physics (HEP) world. Machine Learning (ML) and Deep Learning algorithms have been successfully used in many areas of HEP, such as online and offline reconstruction programs, detector simulation, object reconstruction, identification, and Monte Carlo generation, and they will surely be crucial in the HL-LHC phase. This thesis aims to contribute to a CMS R&D project on an ML "as a Service" solution for HEP needs (MLaaS4HEP). It consists of a data service able to perform an entire ML pipeline (reading data, processing data, training ML models, serving predictions) in a completely model-agnostic fashion, directly using ROOT files of arbitrary size from local or distributed data sources. The framework has been updated with new features in the data preprocessing phase, giving the user more flexibility. Since the MLaaS4HEP framework is experiment-agnostic, the ATLAS Higgs Boson ML challenge has been chosen as the physics use case, with the aim of testing MLaaS4HEP and the contribution made by this work.
Abstract:
This work presents the experimental development of a novel heat treatment for a high-performance Laser Powder Bed Fusion Ti6Al4V alloy. Additive manufacturing processes for titanium alloys are of particular interest in cutting-edge engineering fields; however, high-frequency laser-induced thermal cycles generate a brittle as-built microstructure. For this reason, heat treatments compatible with near-net-shape components are needed before their homologation and use. The experimental campaign focused on developing a multi-step heat treatment leading to a bilamellar microstructure. According to the literature, such a microstructure should be promising in terms of mechanical properties under both static and cyclic loads. Developing the heat treatment required preliminary analyses of samples annealed and aged in the laboratory using several cycles that differed in temperature, time and cooling rate. This characterization was carried out through optical and electron microscopy, image analysis, and hardness and tensile tests. As a result, the most suitable thermal cycle was selected and performed using industrial equipment on mini bending fatigue samples with different surface conditions. The same tests were performed on a batch of traditionally treated samples to provide a comparison. This master's thesis activity finally led to the definition of a heat treatment resulting in a bilamellar microstructure that is promising in terms of fatigue performance compared with the traditionally treated alloy. The industrial implementation of such a heat treatment will require further improvements, particularly regarding the post-annealing water quench, in order to prevent any surface alteration potentially responsible for the drop in fatigue performance. Further development of the research may also include push-pull fatigue tests, crack propagation tests and residual stress analyses.