6 resultados para Graphics hardware

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

30.00% 30.00%

Publicador:

Resumo:

During the last few decades an unprecedented technological growth has been at the center of the embedded systems design paramount, with Moore’s Law being the leading factor of this trend. Today in fact an ever increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded and hardware designers are in fact facing the problem of a huge design space. Virtual Platforms have always been used to enable hardware-software co-design, but today they are facing with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first one is intended for the hardware developer, to easily allow complex cycle accurate simulations of many-core SoCs. The second work exploits the parallel computing power of off-the-shelf General Purpose Graphics Processing Units (GPGPUs), with the goal of an increased simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer to achieve the maximum possible performance of an application, by hiding the complexity of the underlying hardware. 2) to efficiently exploit the high parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis is focused on virtualization techniques with the goal to mitigate, and overtake when possible, some of the challenges introduced by the many-core design paradigm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

ALICE, that is an experiment held at CERN using the LHC, is specialized in analyzing lead-ion collisions. ALICE will study the properties of quarkgluon plasma, a state of matter where quarks and gluons, under conditions of very high temperatures and densities, are no longer confined inside hadrons. Such a state of matter probably existed just after the Big Bang, before particles such as protons and neutrons were formed. The SDD detector, one of the ALICE subdetectors, is part of the ITS that is composed by 6 cylindrical layers with the innermost one attached to the beam pipe. The ITS tracks and identifies particles near the interaction point, it also aligns the tracks of the articles detected by more external detectors. The two ITS middle layers contain the whole 260 SDD detectors. A multichannel readout board, called CARLOSrx, receives at the same time the data coming from 12 SDD detectors. In total there are 24 CARLOSrx boards needed to read data coming from all the SDD modules (detector plus front end electronics). CARLOSrx packs data coming from the front end electronics through optical link connections, it stores them in a large data FIFO and then it sends them to the DAQ system. Each CARLOSrx is composed by two boards. One is called CARLOSrx data, that reads data coming from the SDD detectors and configures the FEE; the other one is called CARLOSrx clock, that sends the clock signal to all the FEE. This thesis contains a description of the hardware design and firmware features of both CARLOSrx data and CARLOSrx clock boards, which deal with all the SDD readout chain. A description of the software tools necessary to test and configure the front end electronics will be presented at the end of the thesis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work describes the development of a simulation tool which allows the simulation of the Internal Combustion Engine (ICE), the transmission and the vehicle dynamics. It is a control oriented simulation tool, designed in order to perform both off-line (Software In the Loop) and on-line (Hardware In the Loop) simulation. In the first case the simulation tool can be used in order to optimize Engine Control Unit strategies (as far as regard, for example, the fuel consumption or the performance of the engine), while in the second case it can be used in order to test the control system. In recent years the use of HIL simulations has proved to be very useful in developing and testing of control systems. Hardware In the Loop simulation is a technology where the actual vehicles, engines or other components are replaced by a real time simulation, based on a mathematical model and running in a real time processor. The processor reads ECU (Engine Control Unit) output signals which would normally feed the actuators and, by using mathematical models, provides the signals which would be produced by the actual sensors. The simulation tool, fully designed within Simulink, includes the possibility to simulate the only engine, the transmission and vehicle dynamics and the engine along with the vehicle and transmission dynamics, allowing in this case to evaluate the performance and the operating conditions of the Internal Combustion Engine, once it is installed on a given vehicle. Furthermore the simulation tool includes different level of complexity, since it is possible to use, for example, either a zero-dimensional or a one-dimensional model of the intake system (in this case only for off-line application, because of the higher computational effort). Given these preliminary remarks, an important goal of this work is the development of a simulation environment that can be easily adapted to different engine types (single- or multi-cylinder, four-stroke or two-stroke, diesel or gasoline) and transmission architecture without reprogramming. Also, the same simulation tool can be rapidly configured both for off-line and real-time application. The Matlab-Simulink environment has been adopted to achieve such objectives, since its graphical programming interface allows building flexible and reconfigurable models, and real-time simulation is possible with standard, off-the-shelf software and hardware platforms (such as dSPACE systems).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis explores the capabilities of heterogeneous multi-core systems, based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated desk side computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had been considered unfeasible before. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing an crucial kernel in computation of multi-mesh head models of encephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only an highly optimized multi-CPU reference, but related GPU-based work as well. Finally, a GPU-accelerated graphical EEG real-time source localization software was implemented. Thanks to acceleration, it can meet real-time requirements in unpreceeded anatomical detail running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scansions has been included.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The evolution of the electronics embedded applications forces electronics systems designers to match their ever increasing requirements. This evolution pushes the computational power of digital signal processing systems, as well as the energy required to accomplish the computations, due to the increasing mobility of such applications. Current approaches used to match these requirements relies on the adoption of application specific signal processors. Such kind of devices exploits powerful accelerators, which are able to match both performance and energy requirements. On the other hand, the too high specificity of such accelerators often results in a lack of flexibility which affects non-recurrent engineering costs, time to market, and market volumes too. The state of the art mainly proposes two solutions to overcome these issues with the ambition of delivering reasonable performance and energy efficiency: reconfigurable computing and multi-processors computing. All of these solutions benefits from the post-fabrication programmability, that definitively results in an increased flexibility. Nevertheless, the gap between these approaches and dedicated hardware is still too high for many application domains, especially when targeting the mobile world. In this scenario, flexible and energy efficient acceleration can be achieved by merging these two computational paradigms, in order to address all the above introduced constraints. This thesis focuses on the exploration of the design and application spectrum of reconfigurable computing, exploited as application specific accelerators for multi-processors systems on chip. More specifically, it introduces a reconfigurable digital signal processor featuring a heterogeneous set of reconfigurable engines, and a homogeneous multi-core system, exploiting three different flavours of reconfigurable and mask-programmable technologies as implementation platform for applications specific accelerators. In this work, the various trade-offs concerning the utilization multi-core platforms and the different configuration technologies are explored, characterizing the design space of the proposed approach in terms of programmability, performance, energy efficiency and manufacturing costs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The new generation of multicore processors opens new perspectives for the design of embedded systems. Multiprocessing, however, poses new challenges to the scheduling of real-time applications, in which the ever-increasing computational demands are constantly flanked by the need of meeting critical time constraints. Many research works have contributed to this field introducing new advanced scheduling algorithms. However, despite many of these works have solidly demonstrated their effectiveness, the actual support for multiprocessor real-time scheduling offered by current operating systems is still very limited. This dissertation deals with implementative aspects of real-time schedulers in modern embedded multiprocessor systems. The first contribution is represented by an open-source scheduling framework, which is capable of realizing complex multiprocessor scheduling policies, such as G-EDF, on conventional operating systems exploiting only their native scheduler from user-space. A set of experimental evaluations compare the proposed solution to other research projects that pursue the same goals by means of kernel modifications, highlighting comparable scheduling performances. The principles that underpin the operation of the framework, originally designed for symmetric multiprocessors, have been further extended first to asymmetric ones, which are subjected to major restrictions such as the lack of support for task migrations, and later to re-programmable hardware architectures (FPGAs). In the latter case, this work introduces a scheduling accelerator, which offloads most of the scheduling operations to the hardware and exhibits extremely low scheduling jitter. The realization of a portable scheduling framework presented many interesting software challenges. One of these has been represented by timekeeping. In this regard, a further contribution is represented by a novel data structure, called addressable binary heap (ABH). Such ABH, which is conceptually a pointer-based implementation of a binary heap, shows very interesting average and worst-case performances when addressing the problem of tick-less timekeeping of high-resolution timers.