10 resultados para System Architectures

em QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevância:

40.00% 40.00%

Publicador:

Resumo:

DRAM technology faces density and power challenges to increase capacity because of limitations of physical cell design. To overcome these limitations, system designers are exploring alternative solutions that combine DRAM and emerging NVRAM technologies. Previous work on heterogeneous memories focuses, mainly, on two system designs: PCache, a hierarchical, inclusive memory system, and HRank, a flat, non-inclusive memory system. We demonstrate that neither of these designs can universally achieve high performance and energy efficiency across a suite of HPC workloads. In this work, we investigate the impact of a number of multilevel memory designs on the performance, power, and energy consumption of applications. To achieve this goal and overcome the limited number of available tools to study heterogeneous memories, we created HMsim, an infrastructure that enables n-level, heterogeneous memory studies by leveraging existing memory simulators. We, then, propose HpMC, a new memory controller design that combines the best aspects of existing management policies to improve performance and energy. Our energy-aware memory management system dynamically switches between PCache and HRank based on the temporal locality of applications. Our results show that HpMC reduces energy consumption from 13% to 45% compared to PCache and HRank, while providing the same bandwidth and higher capacity than a conventional DRAM system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a data flow based run time system as an efficient tool for supporting execution of parallel code on heterogeneous architectures hosting both multicore CPUs and GPUs. We discuss how the proposed run time system may be the target of both structured parallel applications developed using algorithmic skeletons/parallel design patterns and also more "domain specific" programming models. Experimental results demonstrating the feasibility of the approach are presented. © 2012 World Scientific Publishing Company.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Bit-level systolic-array structures for computing sums of products are studied in detail. It is shown that these can be subdivided into two classes and that within each class architectures can be described in terms of a set of constraint equations. It is further demonstrated that high-performance system-level functions with attractive VLSI properties can be constructed by matching data-flow geometries in bit-level and word-level architectures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A methodology for the production of silicon cores for wavelet packet decomposition has been developed. The scheme utilizes efficient scalable architectures for both orthonormal and biorthogonal wavelet transforms. The cores produced from these architectures can be readily scaled for any wavelet function and are easily configurable for any subband structure. The cores are fully parameterized in terms of wavelet choice and appropriate wordlengths. Designs produced are portable across a range of silicon foundries as well as FPGA and PLD technologies. A number of exemplar implementations have been produced.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper introduces hybrid address spaces as a fundamental design methodology for implementing scalable runtime systems on many-core architectures without hardware support for cache coherence. We use hybrid address spaces for an implementation of MapReduce, a programming model for large-scale data processing, and the implementation of a remote memory access (RMA) model. Both implementations are available on the Intel SCC and are portable to similar architectures. We present the design and implementation of HyMR, a MapReduce runtime system whereby different stages and the synchronization operations between them alternate between a distributed memory address space and a shared memory address space, to improve performance and scalability. We compare HyMR to a reference implementation and we find that HyMR improves performance by a factor of 1.71× over a set of representative MapReduce benchmarks. We also compare HyMR with Phoenix++, a state-of-art implementation for systems with hardware-managed cache coherence in terms of scalability and sustained to peak data processing bandwidth, where HyMR demon- strates improvements of a factor of 3.1× and 3.2× respectively. We further evaluate our hybrid remote memory access (HyRMA) programming model and assess its performance to be superior of that of message passing.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

They’re cheap. They’re in every settlement of significance in Britain, Ireland and elsewhere. We all use them but perhaps do not always admit to it. Especially, if we are architects.
Over the past decades Aldi/Lidl low cost supermarkets have escaped from middle Europe to take over large tracts of the English speaking world remaking them according to a formula of mass-produced sheds, buff-coloured cobble-lock car parks, logos in primary colours, bare-shelves and eclectic special offers. Response within architectural discourse to this phenomenon has been largely one of indifference and such places remain, perhaps reiterating Pevsner’s controversial insights into the bicycle shed, on the peripheries of what we might term architecture. This paper seeks to explore the spatial complexities of the discount supermarket and in doing so open up a discussion on the architecture of cheapness. As a road-map, it takes former managing director Dieter Brandes’ treatise on the Aldi formula, Bare Essentials: the Aldi Way to Retailing, and investigates the strategies through which economic exigencies manifest themselves in a series of spatial tactics which involve building. Central to this is the idea of architecture as system rather than form and, in Aldi/Lidl’s case, the result of a spatial network of flows. To understand the architecture of the supermarket, then, it is necessary to measure the times and spaces of supply across the scales of intersection between global and local.
Evaluating the energy, economy and precision of such systems challenges the liminal position of the commercial, the placeless and especially the cheap within architectural discourse. As is well known, architectures of mass-production and prefabrication and their origins exercised modernist thinkers such as Sigfried Giedion and Walter Gropius in the early twentieth century and has undergone a resurgence in recent times. Meanwhile, the mapping of the hitherto overlooked forms and iconography of commerce in Learning from Las Vegas (1971) was extended by Rem Koolhaas et al into an investigation of the technologies, systems and precedents of retail in the Harvard Design School Guide to Shopping, thirty years later in 2001. While obviously always a criteria for building, to find writings on architecture which explicitly celebrate cheapness as a design virtue or, indeed, even iterate the word cheap is more difficult. Walter Gropius’ essay ‘How can we build cheaper, better, more attractive houses?’ (1927), however, situates the cheap within the discussions – articulated, amongst others, by Karl Teige and Bruno Taut – surrounding the minimal dwelling and the moral benefits of absence of the 1920s and 30s.
In our contemporary age of heightened consumption, it is perhaps fitting that an architecture of bare essentials is defined in retail rather than in housing, a commercial existenzminimum where the Miesian paradox of ‘less is more’ is resold as a paradigm of ‘more for less’ in the ubiquitous yet overlooked architectures of the discount supermarket.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Graph analytics is an important and computationally demanding class of data analytics. It is essential to balance scalability, ease-of-use and high performance in large scale graph analytics. As such, it is necessary to hide the complexity of parallelism, data distribution and memory locality behind an abstract interface. The aim of this work is to build a scalable graph analytics framework that does not demand significant parallel programming experience based on NUMA-awareness.
The realization of such a system faces two key problems:
(i)~how to develop a scale-free parallel programming framework that scales efficiently across NUMA domains; (ii)~how to efficiently apply graph partitioning in order to create separate and largely independent work items that can be distributed among threads.