2 resultados para Exascale, Supercomputer,OFET,energy effincency, data locality, HPC

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modern embedded systems embrace many-core shared-memory designs. Due to constrained power and area budgets, most of them feature software-managed scratchpad memories instead of data caches to increase the data locality. It is therefore programmers’ responsibility to explicitly manage the memory transfers, and this make programming these platform cumbersome. Moreover, complex modern applications must be adequately parallelized before they can the parallel potential of the platform into actual performance. To support this, programming languages were proposed, which work at a high level of abstraction, and rely on a runtime whose cost hinders performance, especially in embedded systems, where resources and power budget are constrained. This dissertation explores the applicability of the shared-memory paradigm on modern many-core systems, focusing on the ease-of-programming. It focuses on OpenMP, the de-facto standard for shared memory programming. In a first part, the cost of algorithms for synchronization and data partitioning are analyzed, and they are adapted to modern embedded many-cores. Then, the original design of an OpenMP runtime library is presented, which supports complex forms of parallelism such as multi-level and irregular parallelism. In the second part of the thesis, the focus is on heterogeneous systems, where hardware accelerators are coupled to (many-)cores to implement key functional kernels with orders-of-magnitude of speedup and energy efficiency compared to the “pure software” version. However, three main issues rise, namely i) platform design complexity, ii) architectural scalability and iii) programmability. To tackle them, a template for a generic hardware processing unit (HWPU) is proposed, which share the memory banks with cores, and the template for a scalable architecture is shown, which integrates them through the shared-memory system. Then, a full software stack and toolchain are developed to support platform design and to let programmers exploiting the accelerators of the platform. The OpenMP frontend is extended to interact with it.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Theoretical models are developed for the continuous-wave and pulsed laser incision and cut of thin single and multi-layer films. A one-dimensional steady-state model establishes the theoretical foundations of the problem by combining a power-balance integral with heat flow in the direction of laser motion. In this approach, classical modelling methods for laser processing are extended by introducing multi-layer optical absorption and thermal properties. The calculation domain is consequently divided in correspondence with the progressive removal of individual layers. A second, time-domain numerical model for the short-pulse laser ablation of metals accounts for changes in optical and thermal properties during a single laser pulse. With sufficient fluence, the target surface is heated towards its critical temperature and homogeneous boiling or "phase explosion" takes place. Improvements are seen over previous works with the more accurate calculation of optical absorption and shielding of the incident beam by the ablation products. A third, general time-domain numerical laser processing model combines ablation depth and energy absorption data from the short-pulse model with two-dimensional heat flow in an arbitrary multi-layer structure. Layer removal is the result of both progressive short-pulse ablation and classical vaporisation due to long-term heating of the sample. At low velocity, pulsed laser exposure of multi-layer films comprising aluminium-plastic and aluminium-paper are found to be characterised by short-pulse ablation of the metallic layer and vaporisation or degradation of the others due to thermal conduction from the former. At high velocity, all layers of the two films are ultimately removed by vaporisation or degradation as the average beam power is increased to achieve a complete cut. The transition velocity between the two characteristic removal types is shown to be a function of the pulse repetition rate. An experimental investigation validates the simulation results and provides new laser processing data for some typical packaging materials.