18 resultados para High-Performance Computing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Electronic applications are nowadays converging under the umbrella of the cloud computing vision. The future ecosystem of information and communication technology is going to integrate clouds of portable clients and embedded devices exchanging information, through the internet layer, with processing clusters of servers, data-centers and high performance computing systems. Even thus the whole society is waiting to embrace this revolution, there is a backside of the story. Portable devices require battery to work far from the power plugs and their storage capacity does not scale as the increasing power requirement does. At the other end processing clusters, such as data-centers and server farms, are build upon the integration of thousands multiprocessors. For each of them during the last decade the technology scaling has produced a dramatic increase in power density with significant spatial and temporal variability. This leads to power and temperature hot-spots, which may cause non-uniform ageing and accelerated chip failure. Nonetheless all the heat removed from the silicon translates in high cooling costs. Moreover trend in ICT carbon footprint shows that run-time power consumption of the all spectrum of devices accounts for a significant slice of entire world carbon emissions. This thesis work embrace the full ICT ecosystem and dynamic power consumption concerns by describing a set of new and promising system levels resource management techniques to reduce the power consumption and related issues for two corner cases: Mobile Devices and High Performance Computing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis deals with heterogeneous architectures in standard workstations. Heterogeneous architectures represent an appealing alternative to traditional supercomputers because they are based on commodity components fabricated in large quantities. Hence their price-performance ratio is unparalleled in the world of high performance computing (HPC). In particular, different aspects related to the performance and consumption of heterogeneous architectures have been explored. The thesis initially focuses on an efficient implementation of a parallel application, where the execution time is dominated by an high number of floating point instructions. Then the thesis touches the central problem of efficient management of power peaks in heterogeneous computing systems. Finally it discusses a memory-bounded problem, where the execution time is dominated by the memory latency. Specifically, the following main contributions have been carried out: A novel framework for the design and analysis of solar field for Central Receiver Systems (CRS) has been developed. The implementation based on desktop workstation equipped with multiple Graphics Processing Units (GPUs) is motivated by the need to have an accurate and fast simulation environment for studying mirror imperfection and non-planar geometries. Secondly, a power-aware scheduling algorithm on heterogeneous CPU-GPU architectures, based on an efficient distribution of the computing workload to the resources, has been realized. The scheduler manages the resources of several computing nodes with a view to reducing the peak power. The two main contributions of this work follow: the approach reduces the supply cost due to high peak power whilst having negligible impact on the parallelism of computational nodes. from another point of view the developed model allows designer to increase the number of cores without increasing the capacity of the power supply unit. Finally, an implementation for efficient graph exploration on reconfigurable architectures is presented. The purpose is to accelerate graph exploration, reducing the number of random memory accesses.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modern scientific discoveries are driven by an unsatisfiable demand for computational resources. High-Performance Computing (HPC) systems are an aggregation of computing power to deliver considerably higher performance than one typical desktop computer can provide, to solve large problems in science, engineering, or business. An HPC room in the datacenter is a complex controlled environment that hosts thousands of computing nodes that consume electrical power in the range of megawatts, which gets completely transformed into heat. Although a datacenter contains sophisticated cooling systems, our studies indicate quantitative evidence of thermal bottlenecks in real-life production workload, showing the presence of significant spatial and temporal thermal and power heterogeneity. Therefore minor thermal issues/anomalies can potentially start a chain of events that leads to an unbalance between the amount of heat generated by the computing nodes and the heat removed by the cooling system originating thermal hazards. Although thermal anomalies are rare events, anomaly detection/prediction in time is vital to avoid IT and facility equipment damage and outage of the datacenter, with severe societal and business losses. For this reason, automated approaches to detect thermal anomalies in datacenters have considerable potential. This thesis analyzed and characterized the power and thermal characteristics of a Tier0 datacenter (CINECA) during production and under abnormal thermal conditions. Then, a Deep Learning (DL)-powered thermal hazard prediction framework is proposed. The proposed models are validated against real thermal hazard events reported for the studied HPC cluster while in production. This thesis is the first empirical study of thermal anomaly detection and prediction techniques of a real large-scale HPC system to the best of my knowledge. For this thesis, I used a large-scale dataset, monitoring data of tens of thousands of sensors for around 24 months with a data collection rate of around 20 seconds.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The need for high bandwidth, due to the explosion of new multi\-media-oriented IP-based services, as well as increasing broadband access requirements is leading to the need of flexible and highly reconfigurable optical networks. While transmission bandwidth does not represent a limit due to the huge bandwidth provided by optical fibers and Dense Wavelength Division Multiplexing (DWDM) technology, the electronic switching nodes in the core of the network represent the bottleneck in terms of speed and capacity for the overall network. For this reason DWDM technology must be exploited not only for data transport but also for switching operations. In this Ph.D. thesis solutions for photonic packet switches, a flexible alternative with respect to circuit-switched optical networks are proposed. In particular solutions based on devices and components that are expected to mature in the near future are proposed, with the aim to limit the employment of complex components. The work presented here is the result of part of the research activities performed by the Networks Research Group at the Department of Electronics, Computer Science and Systems (DEIS) of the University of Bologna, Italy. In particular, the work on optical packet switching has been carried on within three relevant research projects: the e-Photon/ONe and e-Photon/ONe+ projects, funded by the European Union in the Sixth Framework Programme, and the national project OSATE funded by the Italian Ministry of Education, University and Scientific Research. The rest of the work is organized as follows. Chapter 1 gives a brief introduction to network context and contention resolution in photonic packet switches. Chapter 2 presents different strategies for contention resolution in wavelength domain. Chapter 3 illustrates a possible implementation of one of the schemes proposed in chapter 2. Then, chapter 4 presents multi-fiber switches, which employ jointly wavelength and space domains to solve contention. Chapter 5 shows buffered switches, to solve contention in time domain besides wavelength domain. Finally chapter 6 presents a cost model to compare different switch architectures in terms of cost.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The first part of this thesis has focused on the construction of a twelve-phase asynchronous machine for More Electric Aircraft (MEA) applications. In fact, the aerospace world has found in electrification the way to improve the efficiency, reliability and maintainability of an aircraft. This idea leads to the aircraft a new management and distribution of electrical services. In this way is possible to remove or to reduce the hydraulic, mechanical and pneumatic systems inside the aircraft. The second part of this dissertation is dedicated on the enhancement of the control range of matrix converters (MCs) operating with non-unity input power factor and, at the same time, on the reduction of the switching power losses. The analysis leads to the determination in closed form of a modulation strategy that features a control range, in terms of output voltage and input power factor, that is greater than that of the traditional strategies under the same operating conditions, and a reduction in the switching power losses.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increasingly strict regulations on greenhouse gas emissions make the fuel economy a pressing factor for automotive manufacturers. Lightweighting and engine downsizing are two strategies pursued to achieve the target. In this context, materials play a key role since these limit the engine efficiency and components weight, due to their acceptable thermo-mechanical loads. Piston is one of the most stressed engine components and it is traditionally made of Al alloys, whose weakness is to maintain adequate mechanical properties at high temperature due to overaging and softening. The enhancement in strength-to-weight ratio at high temperature of Al alloys had been investigated through two approaches: increase of strength at high temperature or reduction of the alloy density. Several conventional and high performance Al-Si and Al-Cu alloys have been characterized from a microstructural and mechanical point of view, investigating the effects of chemical composition, addition of transition elements and heat treatment optimization, in the specific temperature range for pistons operations. Among the Al-Cu alloys, the research outlines the potentialities of two innovative Al-Cu-Li(-Ag) alloys, typically adopted for structural aerospace components. Moreover, due to the increased probability of abnormal combustions in high performance spark-ignition engines, the second part of the dissertation deals with the study of knocking damages on Al pistons. Thanks to the cooperation with Ferrari S.p.A. and Fluid Machinery Research Group - Unibo, several bench tests have been carried out under controlled knocking conditions. Knocking damage mechanisms were investigated through failure analyses techniques, starting from visual analysis up to detailed SEM investigations. These activities allowed to relate piston knocking damage to engine parameters, with the final aim to develop an on-board knocking controller able to increase engine efficiency, without compromising engine functionality. Finally, attempts have been made to quantify the knock-induced damages, to provide a numerical relation with engine working conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The scope of this dissertation is to study the transport phenomena of small molecules in polymers and membranes for gas separation applications, with particular attention to energy efficiency and environmental sustainability. This work seeks to contribute to the development of new competitive selective materials through the characterization of novel organic polymers such as CANALs and ROMPs, as well as through the combination of selective materials obtaining mixed matrix membranes (MMMs), to make membrane technologies competitive with the traditional ones. Kinetic and thermodynamic aspects of the transport properties were investigated in ideal and non-ideal scenarios, such as mixed-gas experiments. The information we gathered contributed to the development of the fundamental understanding related to phenomenon like CO2-induced plasticization and physical aging. Among the most significant results, ZIF-8/PPO MMMs provided materials whose permeability and selectivity were higher than those of the pure materials for He/CO2 separation. The CANALs featured norbornyl benzocyclobutene backbone and thereby introduced a third typology of ladder polymers in the gas separation field, expanding the structural diversity of microporous materials. CANALs have a completely hydrocarbon-based and non-polar rigid backbone, which makes them an ideal model system to investigate structure-property correlations. ROMPs were synthesized by means of the ring opening metathesis living polymerization, which allowed the formation of bottlebrush polymers. CF3-ROMP reveled to be ultrapermeable to CO2, with unprecedented plasticization resistance properties. Mixed-gas experiments in glassy polymer showed that solubility-selectivity controls the separation efficiency of materials in multicomponent conditions. Finally, it was determined that plasticization pressure in not an intrinsic property of a material and does not represent a state of the system, but rather comes from the contribution of solubility coefficient and diffusivity coefficient in the framework of the solution-diffusion model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High Energy efficiency and high performance are the key regiments for Internet of Things (IoT) end-nodes. Exploiting cluster of multiple programmable processors has recently emerged as a suitable solution to address this challenge. However, one of the main bottlenecks for multi-core architectures is the instruction cache. While private caches fall into data replication and wasting area, fully shared caches lack scalability and form a bottleneck for the operating frequency. Hence we propose a hybrid solution where a larger shared cache (L1.5) is shared by multiple cores connected through a low-latency interconnect to small private caches (L1). However, it is still limited by large capacity miss with a small L1. Thus, we propose a sequential prefetch from L1 to L1.5 to improve the performance with little area overhead. Moreover, to cut the critical path for better timing, we optimized the core instruction fetch stage with non-blocking transfer by adopting a 4 x 32-bit ring buffer FIFO and adding a pipeline for the conditional branch. We present a detailed comparison of different instruction cache architectures' performance and energy efficiency recently proposed for Parallel Ultra-Low-Power clusters. On average, when executing a set of real-life IoT applications, our two-level cache improves the performance by up to 20% and loses 7% energy efficiency with respect to the private cache. Compared to a shared cache system, it improves performance by up to 17% and keeps the same energy efficiency. In the end, up to 20% timing (maximum frequency) improvement and software control enable the two-level instruction cache with prefetch adapt to various battery-powered usage cases to balance high performance and energy efficiency.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This Doctoral Thesis aims to study and develop advanced and high-efficient battery chargers for full electric and plug-in electric cars. The document is strictly industry-oriented and relies on automotive standards and regulations. In the first part a general overview about wireless power transfer battery chargers (WPTBCs) and a deep investigation about international standards are carried out. Then, due to the highly increasing attention given to WPTBCs by the automotive industry and considering the need of minimizing weight, size and number of components this work focuses on those architectures that realize a single stage for on-board power conversion avoiding the implementation of the DC/DC converter upstream the battery. Based on the results of the state-of-the-art, the following sections focus on two stages of the architecture: the resonant tank and the primary DC/AC inverter. To reach the maximum transfer efficiency while minimizing weight and size of the vehicle assembly a coordinated system level design procedure for resonant tank along with an innovative control algorithm for the DC/AC primary inverter is proposed. The presented solutions are generalized and adapted for the best trade-off topologies of compensation networks: Series-Series and Series-Parallel. To assess the effectiveness of the above-mentioned objectives, validation and testing are performed through a simulation environment, while experimental test benches are carried out by the collaboration of Delft University of Technology (TU Delft).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work deals with the development of calibration procedures and control systems to improve the performance and efficiency of modern spark ignition turbocharged engines. The algorithms developed are used to optimize and manage the spark advance and the air-to-fuel ratio to control the knock and the exhaust gas temperature at the turbine inlet. The described work falls within the activity that the research group started in the previous years with the industrial partner Ferrari S.p.a. . The first chapter deals with the development of a control-oriented engine simulator based on a neural network approach, with which the main combustion indexes can be simulated. The second chapter deals with the development of a procedure to calibrate offline the spark advance and the air-to-fuel ratio to run the engine under knock-limited conditions and with the maximum admissible exhaust gas temperature at the turbine inlet. This procedure is then converted into a model-based control system and validated with a Software in the Loop approach using the engine simulator developed in the first chapter. Finally, it is implemented in a rapid control prototyping hardware to manage the combustion in steady-state and transient operating conditions at the test bench. The third chapter deals with the study of an innovative and cheap sensor for the in-cylinder pressure measurement, which is a piezoelectric washer that can be installed between the spark plug and the engine head. The signal generated by this kind of sensor is studied, developing a specific algorithm to adjust the value of the knock index in real-time. Finally, with the engine simulator developed in the first chapter, it is demonstrated that the innovative sensor can be coupled with the control system described in the second chapter and that the performance obtained could be the same reachable with the standard in-cylinder pressure sensors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research activity aims at providing a reliable estimation of particular state variables or parameters concerning the dynamics and performance optimization of a MotoGP-class motorcycle, integrating the classical model-based approach with new methodologies involving artificial intelligence. The first topic of the research focuses on the estimation of the thermal behavior of the MotoGP carbon braking system. Numerical tools are developed to assess the instantaneous surface temperature distribution in the motorcycle's front brake discs. Within this application other important brake parameters are identified using Kalman filters, such as the disc convection coefficient and the power distribution in the disc-pads contact region. Subsequently, a physical model of the brake is built to estimate the instantaneous braking torque. However, the results obtained with this approach are highly limited by the knowledge of the friction coefficient (μ) between the disc rotor and the pads. Since the value of μ is a highly nonlinear function of many variables (namely temperature, pressure and angular velocity of the disc), an analytical model for the friction coefficient estimation appears impractical to establish. To overcome this challenge, an innovative hybrid solution is implemented, combining the benefit of artificial intelligence (AI) with classical model-based approach. Indeed, the disc temperature estimated through the thermal model previously implemented is processed by a machine learning algorithm that outputs the actual value of the friction coefficient thus improving the braking torque computation performed by the physical model of the brake. Finally, the last topic of this research activity regards the development of an AI algorithm to estimate the current sideslip angle of the motorcycle's front tire. While a single-track motorcycle kinematic model and IMU accelerometer signals theoretically enable sideslip calculation, the presence of accelerometer noise leads to a significant drift over time. To address this issue, a long short-term memory (LSTM) network is implemented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis explores the capabilities of heterogeneous multi-core systems, based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated desk side computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had been considered unfeasible before. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing an crucial kernel in computation of multi-mesh head models of encephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only an highly optimized multi-CPU reference, but related GPU-based work as well. Finally, a GPU-accelerated graphical EEG real-time source localization software was implemented. Thanks to acceleration, it can meet real-time requirements in unpreceeded anatomical detail running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scansions has been included.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A High-Performance Computing job dispatcher is a critical software that assigns the finite computing resources to submitted jobs. This resource assignment over time is known as the on-line job dispatching problem in HPC systems. The fact the problem is on-line means that solutions must be computed in real-time, and their required time cannot exceed some threshold to do not affect the normal system functioning. In addition, a job dispatcher must deal with a lot of uncertainty: submission times, the number of requested resources, and duration of jobs. Heuristic-based techniques have been broadly used in HPC systems, at the cost of achieving (sub-)optimal solutions in a short time. However, the scheduling and resource allocation components are separated, thus generates a decoupled decision that may cause a performance loss. Optimization-based techniques are less used for this problem, although they can significantly improve the performance of HPC systems at the expense of higher computation time. Nowadays, HPC systems are being used for modern applications, such as big data analytics and predictive model building, that employ, in general, many short jobs. However, this information is unknown at dispatching time, and job dispatchers need to process large numbers of them quickly while ensuring high Quality-of-Service (QoS) levels. Constraint Programming (CP) has been shown to be an effective approach to tackle job dispatching problems. However, state-of-the-art CP-based job dispatchers are unable to satisfy the challenges of on-line dispatching, such as generate dispatching decisions in a brief period and integrate current and past information of the housing system. Given the previous reasons, we propose CP-based dispatchers that are more suitable for HPC systems running modern applications, generating on-line dispatching decisions in a proper time and are able to make effective use of job duration predictions to improve QoS levels, especially for workloads dominated by short jobs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation proposes an analysis of the governance of the European scientific research, focusing on the emergence of the Open Science paradigm: a new way of doing science, oriented towards the openness of every phase of the scientific research process, able to take full advantage of the digital ICTs. The emergence of this paradigm is relatively recent, but in the last years it has become increasingly relevant. The European institutions expressed a clear intention to embrace the Open Science paradigm (eg., think about the European Open Science Cloud, EOSC; or the establishment of the Horizon Europe programme). This dissertation provides a conceptual framework for the multiple interventions of the European institutions in the field of Open Science, addressing the major legal challenges of its implementation. The study investigates the notion of Open Science, proposing a definition that takes into account all its dimensions related to the human and fundamental rights framework in which Open Science is grounded. The inquiry addresses the legal challenges related to the openness of research data, in light of the European Open Data framework and the impact of the GDPR on the context of Open Science. The last part of the study is devoted to the infrastructural dimension of the Open Science paradigm, exploring the e-infrastructures. The focus is on a specific type of computational infrastructure: the High Performance Computing (HPC) facility. The adoption of HPC for research is analysed from the European perspective, investigating the EuroHPC project, and the local perspective, proposing the case study of the HPC facility of the University of Luxembourg, the ULHPC. This dissertation intends to underline the relevance of the legal coordination approach, between all actors and phases of the process, in order to develop and implement the Open Science paradigm, adhering to the underlying human and fundamental rights.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Nowadays, computing is migrating from traditional high performance and distributed computing to pervasive and utility computing based on heterogeneous networks and clients. The current trend suggests that future IT services will rely on distributed resources and on fast communication of heterogeneous contents. The success of this new range of services is directly linked to the effectiveness of the infrastructure in delivering them. The communication infrastructure will be the aggregation of different technologies even though the current trend suggests the emergence of single IP based transport service. Optical networking is a key technology to answer the increasing requests for dynamic bandwidth allocation and configure multiple topologies over the same physical layer infrastructure, optical networks today are still “far” from accessible from directly configure and offer network services and need to be enriched with more “user oriented” functionalities. However, current Control Plane architectures only facilitate efficient end-to-end connectivity provisioning and certainly cannot meet future network service requirements, e.g. the coordinated control of resources. The overall objective of this work is to provide the network with the improved usability and accessibility of the services provided by the Optical Network. More precisely, the definition of a service-oriented architecture is the enable technology to allow user applications to gain benefit of advanced services over an underlying dynamic optical layer. The definition of a service oriented networking architecture based on advanced optical network technologies facilitates users and applications access to abstracted levels of information regarding offered advanced network services. This thesis faces the problem to define a Service Oriented Architecture and its relevant building blocks, protocols and languages. In particular, this work has been focused on the use of the SIP protocol as a inter-layers signalling protocol which defines the Session Plane in conjunction with the Network Resource Description language. On the other hand, an advantage optical network must accommodate high data bandwidth with different granularities. Currently, two main technologies are emerging promoting the development of the future optical transport network, Optical Burst and Packet Switching. Both technologies respectively promise to provide all optical burst or packet switching instead of the current circuit switching. However, the electronic domain is still present in the scheduler forwarding and routing decision. Because of the high optics transmission frequency the burst or packet scheduler faces a difficult challenge, consequentially, high performance and time focused design of both memory and forwarding logic is need. This open issue has been faced in this thesis proposing an high efficiently implementation of burst and packet scheduler. The main novelty of the proposed implementation is that the scheduling problem has turned into simple calculation of a min/max function and the function complexity is almost independent of on the traffic conditions.