105 results for GPU computing


Relevance: 20.00%

Abstract:

We introduce a general scheme for sequential one-way quantum computation where static systems with long-living quantum coherence (memories) interact with moving systems that may possess very short coherence times. Both the generation of the cluster state needed for the computation and its consumption by measurements are carried out simultaneously. As a consequence, effective clusters of one spatial dimension fewer than in the standard approach are sufficient for computation. In particular, universal computation requires only a one-dimensional array of memories. The scheme applies to discrete-variable systems of any dimension as well as to continuous-variable ones, and both are treated equivalently in the light of local complementation of graphs. In this way our formalism introduces a general framework that encompasses and generalizes in a unified manner some previous system-dependent proposals. The procedure is intrinsically well suited for implementations with atom-photon interfaces.
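
The graph operation named in this abstract, local complementation, can be illustrated independently of the quantum setting. The sketch below is not taken from the paper: it assumes an undirected simple graph stored as a dictionary of neighbour sets, and the function name `local_complement` is hypothetical. Local complementation at a vertex `v` toggles every edge between pairs of neighbours of `v` and leaves all other edges unchanged.

```python
def local_complement(adj, v):
    """Return a copy of the graph with the neighbourhood of v complemented.

    adj maps each vertex to the set of its neighbours (undirected, no loops).
    This representation is an assumption made for illustration.
    """
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    nbrs = sorted(adj[v])
    for i, a in enumerate(nbrs):
        for b in nbrs[i + 1:]:
            if b in new_adj[a]:
                # a and b were adjacent: remove the edge
                new_adj[a].discard(b)
                new_adj[b].discard(a)
            else:
                # a and b were not adjacent: add the edge
                new_adj[a].add(b)
                new_adj[b].add(a)
    return new_adj


# A star with centre 0: complementing at the centre connects all leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
complete = local_complement(star, 0)
```

Applying the operation twice at the same vertex restores the original graph, since each edge among the neighbours is toggled twice.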

Relevance: 20.00%

Abstract:

An extension of approximate computing, significance-based computing exploits applications' inherent error resiliency and offers a new structural paradigm that strategically relaxes full computational precision to provide significant energy savings with minimal performance degradation.

Relevance: 20.00%

Abstract:

In this paper, we present a novel discrete cosine transform (DCT) architecture that allows aggressive voltage scaling for low-power dissipation, even under process parameter variations with minimal overhead as opposed to existing techniques. Under a scaled supply voltage and/or variations in process parameters, any possible delay errors appear only from the long paths that are designed to be less contributive to output quality. The proposed architecture allows a graceful degradation in the peak SNR (PSNR) under aggressive voltage scaling as well as extreme process variations. Results show that even under large process variations (±3σ around mean threshold voltage) and aggressive supply voltage scaling (at 0.88 V, while the nominal voltage is 1.2 V for a 90-nm technology), there is a gradual degradation of image quality with considerable power savings (71% at PSNR of 23.4 dB) for the proposed architecture, when compared to existing implementations in a 90-nm process technology. © 2006 IEEE.

Relevance: 20.00%

Abstract:

Hardware designers and engineers typically need to explore a multi-parametric design space in order to find the best configuration for their designs using simulations that can take weeks to months to complete. For example, designers of special purpose chips need to explore parameters such as the optimal bitwidth and data representation. This is the case for the development of complex algorithms such as Low-Density Parity-Check (LDPC) decoders used in modern communication systems. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to graphics processing units (GPUs) and FPGAs. Depending on the simulation requirements, the ideal architecture to use can vary. In this paper we propose a new design flow based on OpenCL, a unified multiplatform programming model, which accelerates LDPC decoding simulations, thereby significantly reducing architectural exploration and design time. OpenCL-based parallel kernels are used without modifications or code tuning on multicore CPUs, GPUs and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL for mapping the simulations onto FPGAs. To the best of our knowledge, this is the first time that a single, unmodified OpenCL code is used to target those three different platforms. We show that, depending on the design parameters to be explored in the simulation, and on the dimension and phase of the design, the GPU or the FPGA may suit different purposes more conveniently, providing different acceleration factors. For example, although simulations can typically execute more than 3x faster on FPGAs than on GPUs, the overhead of circuit synthesis often outweighs the benefits of FPGA-accelerated execution.

Relevance: 20.00%

Abstract:

No Abstract available

Relevance: 20.00%

Abstract:

The goal of the POBICOS project is a platform that facilitates the development and deployment of pervasive computing applications destined for networked, cooperating objects. POBICOS object communities are heterogeneous in terms of the sensing, actuating, and computing resources contributed by each object. Moreover, it is assumed that an object community is formed without any master plan; for example, it may emerge as a by-product of acquiring everyday, POBICOS-enabled objects by a household. As a result, the target object community is, at least partially, unknown to the application programmer, and so a POBICOS application should be able to deliver its functionality on top of diverse object communities (we call this opportunistic computing). The POBICOS platform includes a middleware offering a programming model for opportunistic computing, as well as development and monitoring tools. This paper briefly describes the tools produced in the first phase of the project. Also, the stakeholders using these tools are identified, and a development process for both the middleware and applications is presented. © 2009 IEEE.

Relevance: 20.00%

Abstract:

This paper describes an end-user model for a domestic pervasive computing platform formed by regular home objects. The platform does not rely on pre-planned infrastructure; instead, it exploits objects that are already available in the home and exposes their joint sensing, actuating and computing capabilities to home automation applications. We advocate an incremental process of the platform formation and introduce tangible, object-like artifacts for representing important platform functions. One of those artifacts, the application pill, is a tiny object with a minimal user interface, used to carry the application, as well as to start and stop its execution and provide hints about its operational status. We also emphasize streamlining the user's interaction with the platform. The user engages any UI-capable object of his choice to configure applications, while applications issue notifications and alerts exploiting whichever available objects can be used for that purpose. Finally, the paper briefly describes an actual implementation of the presented end-user model. © (2010) by International Academy, Research, and Industry Association (IARIA).

Relevance: 20.00%

Abstract:

Approximate execution is a viable technique for energy-constrained environments, provided that applications have the mechanisms to produce outputs of the highest possible quality within the given energy budget. We introduce a framework for energy-constrained execution with controlled and graceful quality loss. A simple programming model allows users to express the relative importance of computations for the quality of the end result, as well as minimum quality requirements. The significance-aware runtime system uses an application-specific analytical energy model to identify the degree of concurrency and approximation that maximizes quality while meeting user-specified energy constraints. Evaluation on a dual-socket 8-core server shows that the proposed framework predicts the optimal configuration with high accuracy, enabling energy-constrained executions that result in significantly higher quality compared to loop perforation, a compiler approximation technique.
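
The selection step described in this abstract (choose the degree of concurrency and approximation that maximizes quality within an energy budget) can be sketched as a small search. Everything below is an illustrative assumption rather than the paper's model: the toy `energy` formula, its constants, the use of the accurately executed fraction of work as a quality proxy, and the names `energy` and `best_config` are all hypothetical.

```python
from itertools import product


def energy(cores, ratio, work=100.0, p_static=10.0, p_core=5.0, approx_cost=0.25):
    """Toy analytical energy model (assumed, not from the paper).

    ratio is the fraction of work executed accurately; approximate work
    costs approx_cost times as much. Time scales as work / cores and
    power is a static term plus a per-core term.
    """
    effective_work = work * (ratio + (1.0 - ratio) * approx_cost)
    time = effective_work / cores
    return time * (p_static + p_core * cores)


def best_config(budget, core_options=(1, 2, 4, 8),
                ratios=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return (cores, ratio) with the highest ratio that fits the budget.

    Ties on ratio are broken by lower modelled energy. Returns None when
    no configuration fits the budget.
    """
    feasible = [(r, c) for c, r in product(core_options, ratios)
                if energy(c, r) <= budget]
    if not feasible:
        return None
    ratio, cores = max(feasible, key=lambda rc: (rc[0], -energy(rc[1], rc[0])))
    return cores, ratio


choice = best_config(500.0)
```

With these assumed constants, a budget of 500 units admits at most half of the work being executed accurately, and the search picks the 8-core configuration because it has the lowest modelled energy at that ratio.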

Relevance: 20.00%

Abstract:

We introduce a task-based programming model and runtime system that exploit the observation that not all parts of a program are equally significant for the accuracy of the end result, in order to trade off the quality of program outputs for increased energy efficiency. This is done in a structured and flexible way, allowing for easy exploitation of different points in the quality/energy space, without adversely affecting application performance. The runtime system can apply a number of different policies to decide whether it will execute less-significant tasks accurately or approximately.

The experimental evaluation indicates that our system can achieve an energy reduction of up to 83% compared with a fully accurate execution and up to 35% compared with an approximate version employing loop perforation. At the same time, our approach always results in graceful quality degradation.
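
The idea of tagging tasks with a significance and letting a policy decide which ones run approximately can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's programming model or API: the `Task` class, `run_tasks`, `make_task`, and the ratio-based policy are hypothetical, and the approximate variant simply skips every other element and rescales, chosen here only because it is easy to follow.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    significance: float              # 0.0 (least) to 1.0 (most significant)
    accurate: Callable[[], float]    # full-precision version of the task
    approximate: Callable[[], float]  # cheaper, lower-quality version


def run_tasks(tasks: List[Task], accurate_ratio: float) -> List[float]:
    """Run the most significant fraction of tasks accurately, the rest approximately."""
    n_accurate = math.ceil(accurate_ratio * len(tasks))
    ranked = sorted(range(len(tasks)),
                    key=lambda i: tasks[i].significance, reverse=True)
    accurate_ids = set(ranked[:n_accurate])
    return [t.accurate() if i in accurate_ids else t.approximate()
            for i, t in enumerate(tasks)]


def make_task(chunk: List[int], significance: float) -> Task:
    """Sum of squares over a chunk, with an approximate variant (assumed)."""
    return Task(
        significance=significance,
        accurate=lambda: float(sum(x * x for x in chunk)),
        # Approximation: visit every other element and scale the result by 2.
        approximate=lambda: 2.0 * sum(x * x for x in chunk[::2]),
    )


tasks = [make_task([1, 2, 3, 4], 0.2), make_task([5, 6, 7, 8], 0.9)]
half = run_tasks(tasks, accurate_ratio=0.5)
```

Raising `accurate_ratio` moves the run toward full accuracy, while lowering it trades output quality for less work, which is the quality/energy knob the abstract describes.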

Relevance: 20.00%

Abstract:

This paper investigates the computation of lower/upper expectations that must cohere with a collection of probabilistic assessments and a collection of judgements of epistemic independence. New algorithms, based on multilinear programming, are presented, both for independence among events and among random variables. Separation properties of graphical models are also investigated.