6 resultados para MapReduce


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Multicore computational accelerators such as GPUs are now commodity components for highperformance computing at scale. While such accelerators have been studied in some detail as stand-alone computational engines, their integration in large-scale distributed systems raises new challenges and trade-offs. In this paper, we present an exploration of resource management alternatives for building asymmetric accelerator-based distributed systems. We present these alternatives in the context of a capabilities-aware framework for data-intensive computing, which uses an enhanced implementation of the MapReduce programming model for accelerator-based clusters, compared to the state of the art. The framework can transparently utilize heterogeneous accelerators for deriving high performance with low programming effort. Our work is the first to compare heterogeneous types of accelerators, GPUs and a Cell processors, in the same environment and the first to explore the trade-offs between compute-efficient and control-efficient accelerators on data-intensive systems. Our investigation shows that our framework scales well with the number of different compute nodes. Furthermore, it runs simultaneously on two different types of accelerators, successfully adapts to the resource capabilities, and performs 26.9% better on average than a static execution approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper introduces hybrid address spaces as a fundamental design methodology for implementing scalable runtime systems on many-core architectures without hardware support for cache coherence. We use hybrid address spaces for an implementation of MapReduce, a programming model for large-scale data processing, and the implementation of a remote memory access (RMA) model. Both implementations are available on the Intel SCC and are portable to similar architectures. We present the design and implementation of HyMR, a MapReduce runtime system whereby different stages and the synchronization operations between them alternate between a distributed memory address space and a shared memory address space, to improve performance and scalability. We compare HyMR to a reference implementation and we find that HyMR improves performance by a factor of 1.71× over a set of representative MapReduce benchmarks. We also compare HyMR with Phoenix++, a state-of-art implementation for systems with hardware-managed cache coherence in terms of scalability and sustained to peak data processing bandwidth, where HyMR demon- strates improvements of a factor of 3.1× and 3.2× respectively. We further evaluate our hybrid remote memory access (HyRMA) programming model and assess its performance to be superior of that of message passing.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Emerging web applications like cloud computing, Big Data and social networks have created the need for powerful centres hosting hundreds of thousands of servers. Currently, the data centres are based on general purpose processors that provide high flexibility buts lack the energy efficiency of customized accelerators. VINEYARD aims to develop an integrated platform for energy-efficient data centres based on new servers with novel, coarse-grain and fine-grain, programmable hardware accelerators. It will, also, build a high-level programming framework for allowing end-users to seamlessly utilize these accelerators in heterogeneous computing systems by employing typical data-centre programming frameworks (e.g. MapReduce, Storm, Spark, etc.). This programming framework will, further, allow the hardware accelerators to be swapped in and out of the heterogeneous infrastructure so as to offer high flexibility and energy efficiency. VINEYARD will foster the expansion of the soft-IP core industry, currently limited in the embedded systems, to the data-centre market. VINEYARD plans to demonstrate the advantages of its approach in three real use-cases (a) a bio-informatics application for high-accuracy brain modeling, (b) two critical financial applications, and (c) a big-data analysis application.