951 results for Many-core systems


Relevance:

100.00%

Publisher:

Abstract:

Recent technological advancements and market trends are driving an interesting convergence of the High-Performance Computing (HPC) and Embedded Computing (EC) domains. On one side, new kinds of HPC applications are being demanded by markets that need huge amounts of information processed within a bounded amount of time. On the other side, EC systems are increasingly required to provide high performance in real time, challenging the capabilities of current architectures. The advent of next-generation many-core embedded platforms offers the chance to meet this converging need for predictable high performance, allowing HPC and EC applications to be executed on efficient and powerful heterogeneous architectures that integrate general-purpose processors with many-core computing fabrics. To this end, it is of paramount importance to develop new techniques for exploiting the massively parallel computation capabilities of such platforms in a predictable way. P-SOCRATES will tackle this challenge by bringing together leading research groups from the HPC and EC communities. The time-criticality and parallelisation challenges common to both areas will be addressed by proposing an integrated framework for executing workload-intensive applications with real-time requirements on top of next-generation commercial-off-the-shelf (COTS) platforms based on many-core accelerated architectures. The project will investigate new HPC techniques that fulfil real-time requirements. The main sources of indeterminism will be identified, and efficient mapping and scheduling algorithms will be proposed, along with the associated timing and schedulability analysis, to guarantee the real-time and performance requirements of the applications.
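
For context, the timing and schedulability analysis mentioned above is typically built on response-time style tests. A standard fixed-priority, single-core formulation is shown below purely as an illustrative assumption; it is not quoted from the P-SOCRATES project itself:

    R_i = C_i + \sum_{j \in hp(i)} \left\lceil \frac{R_i}{T_j} \right\rceil C_j

Here C_i is the worst-case execution time of task i, T_j is the period of a higher-priority task j in hp(i), and the smallest fixed point R_i is compared against the deadline D_i. Extending such analyses to many-core platforms, where shared resources introduce additional interference, is the harder problem the abstract alludes to.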

Relevance:

100.00%

Publisher:

Abstract:

Despite the many issues faced in the past, the evolutionary trend of silicon has kept its constant pace, and today an ever-increasing number of cores is integrated onto the same die. Unfortunately, the extraordinary performance achievable with the many-core paradigm is limited by several factors. Memory bandwidth limitations, combined with inefficient synchronization mechanisms, can severely curtail the potential computation capabilities. Moreover, the huge HW/SW design space requires accurate and flexible tools to perform architectural explorations and to validate design choices. In this thesis we focus on the aforementioned aspects: a flexible and accurate Virtual Platform has been developed, targeting a reference many-core architecture. This tool has been used to perform architectural explorations, focusing on the instruction caching architecture and on hybrid HW/SW synchronization mechanisms. Besides architectural implications, another issue of embedded systems is considered: energy efficiency. Near-Threshold Computing (NTC) is a key research area in the Ultra-Low-Power domain, as it promises a tenfold improvement in energy efficiency compared to super-threshold operation and mitigates thermal bottlenecks. The physical implications of modern deep sub-micron technology severely limit the performance and reliability of modern designs. Reliability becomes a major obstacle when operating in NTC: memory operation, in particular, becomes unreliable and can compromise system correctness. In the present work a novel hybrid memory architecture is devised to overcome these reliability issues and, at the same time, to improve energy efficiency by means of aggressive voltage scaling when workload requirements allow it. Variability is another great drawback of near-threshold operation. The greatly increased sensitivity to threshold voltage variations is today a major concern for electronic devices. We introduce a variation-tolerant extension of the baseline many-core architecture: by means of micro-architectural knobs and a lightweight runtime control unit, the baseline architecture becomes dynamically tolerant to variations.
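
A minimal sketch of what the "lightweight runtime control unit" idea could look like in software, assuming hypothetical telemetry and knob interfaces; this is not the thesis's implementation, which operates at the micro-architectural level:

    // Illustrative-only sketch: a per-core control step that lowers the
    // voltage level when the workload has slack and memory has been reliable,
    // and backs off toward nominal voltage otherwise. All names and
    // thresholds are hypothetical.
    #include <cstdint>
    #include <algorithm>

    struct CoreTelemetry {
        double   slack_ratio;   // fraction of the period left idle (workload slack)
        uint32_t ecc_errors;    // corrected memory errors in the last window
    };

    struct CoreKnobs {
        int vdd_level;          // discrete level: 0 = near-threshold ... 5 = nominal
    };

    void control_step(const CoreTelemetry& t, CoreKnobs& k) {
        const double   kSlackToScaleDown   = 0.30;  // hypothetical thresholds
        const uint32_t kMaxTolerableErrors = 2;

        if (t.ecc_errors > kMaxTolerableErrors) {
            k.vdd_level = std::min(k.vdd_level + 2, 5);  // reliability first
        } else if (t.slack_ratio > kSlackToScaleDown) {
            k.vdd_level = std::max(k.vdd_level - 1, 0);  // save energy
        } else if (t.slack_ratio < 0.05) {
            k.vdd_level = std::min(k.vdd_level + 1, 5);  // avoid missing deadlines
        }
    }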

Relevance:

100.00%

Publisher:

Abstract:

During the last few decades, unprecedented technological growth has been at the center of the embedded systems design landscape, with Moore’s Law being the leading factor of this trend. Today, in fact, an ever-increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoCs), hardware designers are facing a design space that has exploded in size. Virtual Platforms have long been used to enable hardware-software co-design, but today they must cope with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first is intended for the hardware developer, to easily allow complex cycle-accurate simulations of many-core SoCs; the second exploits the parallel computing power of off-the-shelf General-Purpose Graphics Processing Units (GPGPUs), with the goal of increasing simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer achieve the maximum possible performance of an application, by hiding the complexity of the underlying hardware; 2) to efficiently exploit the highly parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis focuses on virtualization techniques with the goal of mitigating, and where possible overcoming, some of the challenges introduced by the many-core design paradigm.

Relevance:

100.00%

Publisher:

Abstract:

The current approach to developing mixed-criticality systems is by partitioning the hardware resources (processors, memory and I/O devices) among the different applications. Partitions are isolated from each other both in the temporal and the spatial domain, so that low-criticality applications cannot compromise other applications with a higher level of criticality in case of misbehaviour. New architectures based on many-core processors open the way to highly parallel systems in which each partition can be allocated to a set of dedicated processor cores, thus simplifying partition scheduling and temporal separation. Moreover, spatial isolation can also benefit from many-core architectures, by using simpler hardware mechanisms to protect the address spaces of different applications. This paper describes an architecture for many-core embedded partitioned systems, together with some implementation advice for spatial isolation.
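
A conceptual sketch of the partition-to-resource mapping described above, with hypothetical types, cores and addresses; the paper targets hypervisor/kernel-level mechanisms, so this only illustrates the data involved:

    // Conceptual sketch only (not the paper's implementation): each partition
    // is pinned to dedicated cores (temporal isolation) and a private physical
    // memory window protected by the MMU/MPU (spatial isolation).
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    enum class Criticality { Low, High };

    struct Partition {
        Criticality level;
        std::vector<unsigned> cores;   // cores dedicated to this partition
        uintptr_t mem_base;            // base of its private physical memory window
        std::size_t mem_size;
    };

    // Example configuration: a high-criticality control partition on cores 0-3
    // and a low-criticality monitoring partition on the remaining cores.
    const std::vector<Partition> kSystemConfig = {
        { Criticality::High, {0, 1, 2, 3},
          0x8000'0000, 64 * 1024 * 1024 },
        { Criticality::Low,  {4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15},
          0x8400'0000, 192 * 1024 * 1024 },
    };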

Relevance:

100.00%

Publisher:

Abstract:

An abstract of this work will be presented at the Compiler, Architecture and Tools Conference (CATC), Intel Development Center, Haifa, Israel, on November 23, 2015.

Relevance:

100.00%

Publisher:

Abstract:

This paper introduces hybrid address spaces as a fundamental design methodology for implementing scalable runtime systems on many-core architectures without hardware support for cache coherence. We use hybrid address spaces for an implementation of MapReduce, a programming model for large-scale data processing, and for the implementation of a remote memory access (RMA) model. Both implementations are available on the Intel SCC and are portable to similar architectures. We present the design and implementation of HyMR, a MapReduce runtime system whereby different stages, and the synchronization operations between them, alternate between a distributed memory address space and a shared memory address space to improve performance and scalability. We compare HyMR to a reference implementation and find that HyMR improves performance by a factor of 1.71× over a set of representative MapReduce benchmarks. We also compare HyMR with Phoenix++, a state-of-the-art implementation for systems with hardware-managed cache coherence, in terms of scalability and sustained-to-peak data processing bandwidth, where HyMR demonstrates improvements by factors of 3.1× and 3.2× respectively. We further evaluate our hybrid remote memory access (HyRMA) programming model and find its performance to be superior to that of message passing.
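
The hybrid-address-space idea can be illustrated with a toy word-count MapReduce in which map tasks write only into per-worker private buffers (standing in for the distributed, non-coherent address space) and a single merge stage uses shared memory. This is a conceptual sketch with hypothetical names, not the HyMR code:

    // Map phase: each worker updates only its private buffer (no coherence
    // needed). Merge phase: a single shared structure is filled sequentially.
    #include <functional>
    #include <map>
    #include <string>
    #include <thread>
    #include <vector>

    using Counts = std::map<std::string, long>;

    void map_task(const std::vector<std::string>& chunk, Counts& private_out) {
        for (const auto& word : chunk) ++private_out[word];   // no sharing, no locks
    }

    Counts hybrid_wordcount(const std::vector<std::vector<std::string>>& chunks) {
        std::vector<Counts> partials(chunks.size());          // one private buffer per worker
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < chunks.size(); ++i)
            workers.emplace_back(map_task, std::cref(chunks[i]), std::ref(partials[i]));
        for (auto& w : workers) w.join();

        Counts shared;                                        // shared-memory merge stage
        for (const auto& p : partials)
            for (const auto& [word, n] : p) shared[word] += n;
        return shared;
    }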

Relevance:

100.00%

Publisher:

Abstract:

In the reinsurance market, the risks natural catastrophes pose to portfolios of properties must be quantified, so that they can be priced, and insurance offered. The analysis of such risks at a portfolio level requires a simulation of up to 800 000 trials with an average of 1000 catastrophic events per trial. This is sufficient to capture risk for a global multi-peril reinsurance portfolio covering a range of perils including earthquake, hurricane, tornado, hail, severe thunderstorm, wind storm, storm surge and riverine flooding, and wildfire. Such simulations are both computation and data intensive, making the application of high-performance computing techniques desirable.

In this paper, we explore the design and implementation of portfolio risk analysis on both multi-core and many-core computing platforms. Given a portfolio of property catastrophe insurance treaties, key risk measures, such as probable maximum loss, are computed by taking both primary and secondary uncertainties into account. Primary uncertainty is associated with whether or not an event occurs in a simulated year, while secondary uncertainty captures the uncertainty in the level of loss due to the use of simplified physical models and limitations in the available data. A combination of fast lookup structures, multi-threading and careful hand tuning of numerical operations is required to achieve good performance. Experimental results are reported for multi-core processors and for systems using NVIDIA graphics processing units and Intel Xeon Phi many-core accelerators.
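
A toy illustration of the simulation structure described above, assuming hypothetical event parameters and distributions; the production engine uses far richer loss models and the hand-tuned numerics mentioned in the abstract:

    // Estimate a probable maximum loss (PML) at a given exceedance probability
    // from simulated trial years. Each trial year draws how many times each
    // event occurs (primary uncertainty) and samples a loss per occurrence
    // (secondary uncertainty). Distributions and parameters are hypothetical.
    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <random>
    #include <vector>

    struct Event { double annual_rate; double mean_loss; double loss_sigma; };

    double simulate_pml(const std::vector<Event>& events,
                        int trials, double exceedance_prob, unsigned seed = 42) {
        std::mt19937_64 rng(seed);
        std::vector<double> yearly_loss(trials, 0.0);

        for (int t = 0; t < trials; ++t) {
            for (const auto& e : events) {
                std::poisson_distribution<int> occurs(e.annual_rate);      // primary
                std::lognormal_distribution<double> severity(              // secondary
                    std::log(e.mean_loss), e.loss_sigma);
                int n = occurs(rng);
                for (int k = 0; k < n; ++k) yearly_loss[t] += severity(rng);
            }
        }
        std::sort(yearly_loss.begin(), yearly_loss.end());
        std::size_t idx = static_cast<std::size_t>((1.0 - exceedance_prob) * trials);
        return yearly_loss[std::min(idx, yearly_loss.size() - 1)];         // e.g. 1-in-250-year loss
    }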

Relevance:

100.00%

Publisher:

Abstract:

Final Master's project submitted to obtain the degree of Master in Electronics and Telecommunications Engineering.

Relevance:

100.00%

Publisher:

Abstract:

Single-processor architectures are unable to provide the performance required by high-performance embedded systems. Parallel processing based on general-purpose processors can achieve this performance, but with a considerable increase in required resources. In many cases, however, simplified optimized parallel cores can be used instead of general-purpose processors, achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations, interconnected by a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and the number of interfacing ports. The architecture was tested on a ZYNQ-7020 FPGA through the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than a parallel general-purpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.
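
A conceptual model of the per-core configurability described above, with hypothetical names; in the real architecture these parameters are fixed at FPGA synthesis time rather than in C++:

    // Sketch of the configuration knobs the abstract lists: internal memory
    // size, supported floating-point operations, and interfacing ports per
    // core, plus a configurable interconnection network.
    #include <cstddef>
    #include <vector>

    enum class FpOp { Add, Mul, Div, Sqrt };

    struct CoreConfig {
        std::size_t local_mem_bytes;      // size of the core's internal memory
        std::vector<FpOp> supported_ops;  // floating-point operations instantiated
        unsigned num_ports;               // interfacing ports to the interconnect
    };

    struct ManyCoreConfig {
        std::vector<CoreConfig> cores;    // e.g. up to 32 FP cores on a ZYNQ-7020
        unsigned network_width_bits;      // configurable interconnection network
    };

    // Example: 32 identical lightweight FP cores with 4 KiB of local memory each.
    ManyCoreConfig example_config() {
        CoreConfig c{4096, {FpOp::Add, FpOp::Mul}, 2};
        return ManyCoreConfig{std::vector<CoreConfig>(32, c), 32};
    }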

Relevance:

100.00%

Publisher:

Abstract:

In this thesis, I will discuss how information-theoretic arguments can be used to produce sharp bounds in the study of quantum many-body systems. The main advantage of this approach, as opposed to the conventional field-theoretic argument, is that it depends very little on the precise form of the Hamiltonian. The main idea behind this thesis lies in a number of results concerning the structure of quantum states that are conditionally independent. Depending on the application, some of these statements are generalized to quantum states that are approximately conditionally independent. These structures can be readily used in the study of gapped quantum many-body systems, especially those in two spatial dimensions. A number of rigorous results are derived, including (i) a universal upper bound for the maximal number of topologically protected states, expressed in terms of the topological entanglement entropy, (ii) a first-order perturbation bound for the topological entanglement entropy that decays superpolynomially with the size of the subsystem, and (iii) a correlation bound between an arbitrary local operator and a topological operator constructed from a set of local reduced density matrices. I also introduce exactly solvable models supported on a three-dimensional lattice that can be used as a reliable quantum memory.
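
For reference, the topological entanglement entropy appearing in (i) and (ii) is conventionally defined through the Kitaev-Preskill combination of subsystem entropies (quoted here as the standard definition, not a result specific to this thesis):

    S_{\mathrm{topo}} = S_A + S_B + S_C - S_{AB} - S_{BC} - S_{CA} + S_{ABC}

where A, B, C are three regions meeting at a point and S_X is the von Neumann entropy of the reduced state on X; for a gapped topologically ordered phase this combination equals -\gamma, with \gamma = \log \mathcal{D} the logarithm of the total quantum dimension.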

Relevance:

100.00%

Publisher:

Abstract:

Disorder and interactions both play crucial roles in quantum transport. Decades ago, Mott showed that electron-electron interactions can lead to insulating behavior in materials that conventional band theory predicts to be conducting. Soon thereafter, Anderson demonstrated that disorder can localize a quantum particle through the wave interference phenomenon of Anderson localization. Although interactions and disorder both separately induce insulating behavior, the interplay of these two ingredients is subtle and often leads to surprising behavior at the periphery of our current understanding. Modern experiments probe these phenomena in a variety of contexts (e.g. disordered superconductors, cold atoms, photonic waveguides, etc.); thus, theoretical and numerical advancements are urgently needed. In this thesis, we report progress on understanding two contexts in which the interplay of disorder and interactions is especially important.

The first is the so-called “dirty” or random boson problem. In the past decade, a strong-disorder renormalization group (SDRG) treatment by Altman, Kafri, Polkovnikov, and Refael has raised the possibility of a new unstable fixed point governing the superfluid-insulator transition in the one-dimensional dirty boson problem. This new critical behavior may take over from the weak-disorder criticality of Giamarchi and Schulz when disorder is sufficiently strong. We analytically determine the scaling of the superfluid susceptibility at the strong-disorder fixed point and connect our analysis to recent Monte Carlo simulations by Hrahsheh and Vojta. We then shift our attention to two dimensions and use a numerical implementation of the SDRG to locate the fixed point governing the superfluid-insulator transition there. We identify several universal properties of this transition, which are fully independent of the microscopic features of the disorder.

The second focus of this thesis is the interplay of localization and interactions in systems with high energy density (i.e., far from the usual low energy limit of condensed matter physics). Recent theoretical and numerical work indicates that localization can survive in this regime, provided that interactions are sufficiently weak. Stronger interactions can destroy localization, leading to a so-called many-body localization transition. This dynamical phase transition is relevant to questions of thermalization in isolated quantum systems: it separates a many-body localized phase, in which localization prevents transport and thermalization, from a conducting (“ergodic”) phase in which the usual assumptions of quantum statistical mechanics hold. Here, we present evidence that many-body localization also occurs in quasiperiodic systems that lack true disorder.
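
As a concrete example of a quasiperiodic system without true disorder (given here as an assumption for illustration; the abstract does not name the specific model studied), the interacting Aubry-André model replaces a random potential with an incommensurate one:

    H = -t \sum_i \left( c_i^\dagger c_{i+1} + \mathrm{h.c.} \right)
        + \Delta \sum_i \cos(2\pi \beta i + \phi)\, n_i
        + U \sum_i n_i n_{i+1}

with \beta irrational, so the on-site potential is deterministic yet never repeats.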