5 results for Efficient implementation
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
This thesis explores the capabilities of heterogeneous multi-core systems based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated deskside computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had previously been considered unfeasible. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing a crucial kernel in the computation of multi-mesh head models for electroencephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself, achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-Hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only a highly optimized multi-CPU reference, but related GPU-based work as well.
Finally, a GPU-accelerated graphical software tool for real-time EEG source localization was implemented. Thanks to acceleration, it can meet real-time requirements in unprecedented anatomical detail while running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scans has been included.
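To make the first contribution more concrete, the following is a minimal sketch of triangular matrix inversion on a packed (compressed) triangular layout. It is an illustration only: the abstract does not describe the thesis's storage scheme or GPU kernels, so the row-major packed layout and the helper names (packed_index, pack_lower, invert_lower_packed) are assumptions chosen for clarity. Each column of the inverse depends only on the input matrix and on earlier entries of the same column, which is the property that makes the problem attractive for parallel hardware.

```python
def packed_index(i, j):
    """Offset of element (i, j), with j <= i, in row-major packed storage.

    Assumed layout (hypothetical, not taken from the thesis): only the
    lower triangle is stored, row after row, so row i starts at i*(i+1)/2.
    """
    return i * (i + 1) // 2 + j


def pack_lower(dense):
    """Copy the lower triangle of a dense square matrix into packed form."""
    n = len(dense)
    return [dense[i][j] for i in range(n) for j in range(i + 1)]


def invert_lower_packed(packed, n):
    """Invert an n x n lower-triangular matrix held in packed form.

    Solves L * X = I one column at a time by forward substitution.
    Columns are independent of each other, so each could be handled
    by a separate worker in a parallel implementation.
    """
    inv = [0.0] * len(packed)
    for j in range(n):
        inv[packed_index(j, j)] = 1.0 / packed[packed_index(j, j)]
        for i in range(j + 1, n):
            acc = sum(
                packed[packed_index(i, k)] * inv[packed_index(k, j)]
                for k in range(j, i)
            )
            inv[packed_index(i, j)] = -acc / packed[packed_index(i, i)]
    return inv


if __name__ == "__main__":
    lower = [[2.0, 0.0, 0.0],
             [1.0, 4.0, 0.0],
             [3.0, 5.0, 8.0]]
    print(invert_lower_packed(pack_lower(lower), 3))
```

The packed layout stores n(n+1)/2 values instead of n*n, which roughly halves the memory footprint; how the thesis arranges those values for GPU-friendly access is not stated in the abstract.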
Abstract:
Over the last 60 years, computers and software have enabled incredible advancements in every field. Nowadays, however, these systems are so complicated that it is challenging to understand whether they meet a given requirement or are able to show a desired behaviour or property. This dissertation introduces a Just-In-Time (JIT) a posteriori approach that performs the conformance check in order to identify any deviation from the desired behaviour as soon as possible, and possibly apply some corrections. The declarative framework that implements our approach, entirely developed on the promising open source forward-chaining Production Rule System (PRS) named Drools, consists of three components: 1. a monitoring module based on a novel, efficient implementation of Event Calculus (EC), 2. a general purpose hybrid reasoning module (the first of its kind) merging temporal, semantic, fuzzy and rule-based reasoning, 3. a logic formalism based on the concept of expectations, introducing Event-Condition-Expectation rules (ECE-rules) to assess the global conformance of a system. The framework is also accompanied by an optional module that provides Probabilistic Inductive Logic Programming (PILP). By shifting the conformance check from after execution to just in time, this approach combines the advantages of many a posteriori and a priori methods proposed in the literature. Quite remarkably, if the corrective actions are explicitly given, the reactive nature of this methodology allows any deviation from the desired behaviour to be reconciled as soon as it is detected. In conclusion, the proposed methodology brings some advancements towards solving the problem of conformance checking, helping to fill the gap between humans and increasingly complex technology.
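As a rough illustration of the Event Calculus idea behind the monitoring module, the sketch below tracks whether a fluent (a time-varying property) holds at a given instant, based on events that initiate or terminate it. This is a simplified textbook-style formulation written in Python, not the Drools-based implementation described in the thesis; the event names, the fluent name and the function holds_at are hypothetical.

```python
def holds_at(fluent, time, events, initiates, terminates):
    """Return True if `fluent` holds at `time` in a simple Event Calculus.

    events:     list of (timestamp, event_name) pairs
    initiates:  dict mapping event_name -> set of fluents it starts
    terminates: dict mapping event_name -> set of fluents it ends

    A fluent holds at `time` if the most recent relevant event that
    happened strictly before `time` initiated it.
    """
    holds = False
    for stamp, name in sorted(events):
        if stamp >= time:
            break
        if fluent in initiates.get(name, set()):
            holds = True
        elif fluent in terminates.get(name, set()):
            holds = False
    return holds


if __name__ == "__main__":
    # Hypothetical trace: a session is opened, closed, and opened again.
    trace = [(1, "login"), (5, "logout"), (8, "login")]
    starts = {"login": {"session_open"}}
    ends = {"logout": {"session_open"}}
    for t in (1, 3, 6, 9):
        print(t, holds_at("session_open", t, trace, starts, ends))
```

A conformance monitor in this style could compare such fluent states against what is expected at each instant and flag a deviation as soon as the two disagree; the expectation rules themselves are not modelled here.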
Abstract:
This thesis deals with heterogeneous architectures in standard workstations. Heterogeneous architectures represent an appealing alternative to traditional supercomputers because they are based on commodity components fabricated in large quantities. Hence their price-performance ratio is unparalleled in the world of high performance computing (HPC). In particular, different aspects related to the performance and power consumption of heterogeneous architectures have been explored. The thesis initially focuses on an efficient implementation of a parallel application, where the execution time is dominated by a high number of floating point instructions. Then the thesis touches on the central problem of efficient management of power peaks in heterogeneous computing systems. Finally, it discusses a memory-bound problem, where the execution time is dominated by memory latency. Specifically, the following main contributions have been made. First, a novel framework for the design and analysis of solar fields for Central Receiver Systems (CRS) has been developed. The implementation, based on a desktop workstation equipped with multiple Graphics Processing Units (GPUs), is motivated by the need for an accurate and fast simulation environment for studying mirror imperfections and non-planar geometries. Secondly, a power-aware scheduling algorithm for heterogeneous CPU-GPU architectures, based on an efficient distribution of the computing workload to the resources, has been realized. The scheduler manages the resources of several computing nodes with a view to reducing the peak power. The two main contributions of this work are as follows. First, the approach reduces the supply cost due to high peak power whilst having negligible impact on the parallelism of computational nodes. Second, the developed model allows designers to increase the number of cores without increasing the capacity of the power supply unit.
Finally, an implementation for efficient graph exploration on reconfigurable architectures is presented. The purpose is to accelerate graph exploration, reducing the number of random memory accesses.
Abstract:
Dynamical models of stellar systems represent a powerful tool to study their internal structure and dynamics, to interpret the observed morphological and kinematical fields, and also to support numerical simulations of their evolution. We present a method especially designed to build axisymmetric Jeans models of galaxies, assumed to be stationary and collisionless stellar systems. The aim is the development of a rigorous and flexible modelling procedure for multicomponent galaxies, composed of different stellar and dark matter distributions and a central supermassive black hole. The stellar components, in particular, are intended to represent different galaxy structures, such as discs, bulges, and halos, and can therefore have different structural (density profile, flattening, mass, scale-length), dynamical (rotation, velocity dispersion anisotropy), and population (age, metallicity, initial mass function, mass-to-light ratio) properties. The theoretical framework supporting the modelling procedure is presented, with the introduction of a suitable nomenclature, and its numerical implementation is discussed, with particular reference to the numerical code JASMINE2, developed for this purpose. We propose an approach for efficiently scaling the contributions in mass, luminosity, and rotational support of the different matter components, allowing for fast and flexible explorations of the model parameter space. We also offer different methods for the computation of the gravitational potentials associated with the density components, which are especially convenient because of their easier numerical tractability. A few galaxy models are studied, showing internal and projected structural and dynamical properties of multicomponent galaxies, with a focus on axisymmetric early-type galaxies with complex kinematical morphologies.
The application of galaxy models to the study of initial conditions for hydrodynamical and $N$-body simulations of galaxy evolution is also addressed, allowing in particular the investigation of the large number of interesting combinations of the parameters which determine the structure and dynamics of complex multicomponent stellar systems.
Abstract:
Spectral sensors are a wide class of devices that are extremely useful for detecting essential information about the environment and materials with a high degree of selectivity. Recently, they have achieved high degrees of integration and low implementation costs, making them suited for fast, small, and non-invasive monitoring systems. However, the useful information is hidden in the spectra and is difficult to decode. Mathematical algorithms are therefore needed to infer the value of the variables of interest from the acquired data. Among the different families of predictive modeling, Principal Component Analysis and the techniques derived from it can provide very good performance, as well as small computational and memory requirements. For these reasons, they allow the implementation of the prediction even in embedded and autonomous devices. In this thesis, I will present four practical applications of these algorithms to the prediction of different variables: moisture of soil, moisture of concrete, freshness of anchovies/sardines, and concentration of gases. In all of these cases, the workflow is the same. First, an acquisition campaign is performed to acquire both spectra and the variables of interest from samples. Then these data are used as input for the creation of the prediction models, to solve both classification and regression problems. From these models, an array of calibration coefficients is derived and used for the implementation of the prediction in an embedded system. The presented results will show that this workflow was successfully applied to very different scientific fields, obtaining autonomous and non-invasive devices able to predict the value of physical parameters of choice from new spectral acquisitions.
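The workflow described above (acquire spectra and reference values, build a model, derive calibration coefficients, predict on the device) can be sketched as follows. This is only an illustration under stated assumptions: it uses principal component regression with a single component, found by power iteration, as a stand-in for the PCA-derived techniques mentioned in the abstract, which does not specify the exact models used. The function names and the synthetic data are hypothetical.

```python
import math


def fit_pcr_one_component(spectra, targets, iterations=200):
    """Fit a one-component principal component regression.

    Returns (intercept, coefficients), so that a prediction is just
    intercept + dot(coefficients, spectrum), which is cheap enough
    for an embedded device.
    """
    n, p = len(spectra), len(spectra[0])
    mean_x = [sum(row[j] for row in spectra) / n for j in range(p)]
    mean_y = sum(targets) / n
    xc = [[row[j] - mean_x[j] for j in range(p)] for row in spectra]
    yc = [y - mean_y for y in targets]

    # Power iteration for the leading principal direction of the centred
    # spectra. Assumes the start vector is not orthogonal to that direction.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in xc]
        w = [sum(xc[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]

    # Regress the centred target on the component scores.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in xc]
    denom = sum(t * t for t in scores)
    slope = sum(t * y for t, y in zip(scores, yc)) / denom if denom else 0.0

    # Fold everything into one array of calibration coefficients.
    coeffs = [slope * c for c in v]
    intercept = mean_y - sum(m * c for m, c in zip(mean_x, coeffs))
    return intercept, coeffs


def predict(intercept, coeffs, spectrum):
    """Apply the calibration coefficients to a new spectral acquisition."""
    return intercept + sum(c * s for c, s in zip(coeffs, spectrum))


if __name__ == "__main__":
    # Synthetic three-channel "spectra" driven by one latent variable.
    spectra = [[4.0, -1.0, -2.0], [5.0, 1.0, 0.0],
               [6.0, 3.0, 2.0], [7.0, 5.0, 4.0]]
    targets = [7.0, 10.0, 13.0, 16.0]
    b0, b = fit_pcr_one_component(spectra, targets)
    print(predict(b0, b, [8.0, 7.0, 6.0]))
```

Only the intercept and the coefficient array need to be stored on the embedded system; the model fitting itself can run offline on the data from the acquisition campaign.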