49 results for Parallelism
Abstract:
Traditional static analysis fails to auto-parallelize programs with complex control and data flow. Furthermore, thread-level parallelism in such programs is often restricted to pipeline parallelism, which can be hard for a programmer to discover. In this paper we propose a tool that, based on profiling information, helps the programmer discover parallelism. The programmer hand-picks code transformations from among the proposed candidates, which are then applied by automatic code transformation techniques.
This paper contributes to the literature by presenting a profiling tool for discovering thread-level parallelism. We track dependencies at the whole-data-structure level, rather than at the element or byte level, in order to limit the profiling overhead, and we thoroughly analyze the requirements and costs of this technique. Furthermore, we present and validate the hypothesis that programs with complex control and data flow contain significant amounts of exploitable coarse-grain pipeline parallelism in their outer loops. This observation justifies our use of whole-data-structure dependencies. Since state-of-the-art compilers focus on loops iterating over the members of a data structure, it also explains why our approach finds coarse-grain pipeline parallelism in cases that have remained out of reach for those compilers. In cases where traditional compilation techniques do find parallelism, our approach discovers higher degrees of parallelism, yielding a 40% speedup over traditional compilation techniques. Moreover, we demonstrate real speedups on multiple hardware platforms.
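To make the idea concrete, the following minimal C++ sketch (not from the paper; the Channel type and the stage bodies are illustrative assumptions) shows the shape of coarse-grain pipeline parallelism in an outer loop: each stage keeps its own loop-carried state, so its iterations must stay in order internally, yet the stages can overlap with one another on different threads.

// Hypothetical sketch of coarse-grain pipeline parallelism across an
// outer loop; stage names and data types are illustrative only.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>

// A tiny thread-safe queue connecting two pipeline stages.
template <typename T>
class Channel {
public:
    void push(std::optional<T> v) {
        { std::lock_guard<std::mutex> g(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    std::optional<T> pop() {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [&] { return !q_.empty(); });
        std::optional<T> v = std::move(q_.front());
        q_.pop();
        return v;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::optional<T>> q_;
};

int main() {
    Channel<int> ch;

    // Stage 1: "parse" each record; the running state is a loop-carried
    // dependency, so this stage's iterations stay sequential.
    std::thread producer([&] {
        int state = 0;
        for (int i = 0; i < 8; ++i) {
            state += i;            // sequential dependency within the stage
            ch.push(state);        // hand the record to the next stage
        }
        ch.push(std::nullopt);     // end-of-stream marker
    });

    // Stage 2: "process" runs concurrently with stage 1, overlapping
    // iterations of the original outer loop across two threads.
    std::thread consumer([&] {
        while (auto v = ch.pop())
            std::printf("processed %d\n", *v);
    });

    producer.join();
    consumer.join();
    return 0;
}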
Abstract:
Task dataflow languages simplify the specification of parallel programs by dynamically detecting and enforcing dependencies between tasks. These languages are, however, often restricted to a single level of parallelism. This language design is reflected in the runtime system, where a master thread explicitly generates a task graph and worker threads execute ready tasks and wake up their dependents. Such an approach is incompatible with state-of-the-art schedulers such as the Cilk scheduler, which minimize the creation of idle tasks (the work-first principle) and place all task creation and scheduling off the critical path. This paper proposes an extension to the Cilk scheduler that reconciles task dependencies with the work-first principle. We discuss the impact of task dependencies on the properties of the Cilk scheduler. Furthermore, we propose a low-overhead ticket-based technique for tracking and enforcing dependencies at the object level. Our scheduler also supports renaming of objects to increase task-level parallelism; renaming is implemented using versioned objects, a new type of hyperobject. Experimental evaluation shows that the unified scheduler is as efficient as the Cilk scheduler when tasks have no dependencies, and more efficient than SMPSs, a particular implementation of a task dataflow language.
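As an illustration of object-level ticket-based dependency tracking, the sketch below gives each shared object a ticket dispenser and a "now serving" counter: a task takes a ticket when it is created and may run only when its ticket comes up. The names (VersionedObject, acquire, wait_turn, release), the spin-wait, and the use of raw threads are assumptions made for this sketch; the paper's scheduler instead integrates dependency tracking with the Cilk work-stealing runtime.

// Hypothetical sketch of ticket-based dependency enforcement at the
// object level; not the authors' implementation.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Per-object metadata: a ticket dispenser and a "now serving" counter.
struct VersionedObject {
    std::atomic<unsigned> next_ticket{0};   // handed out at task creation
    std::atomic<unsigned> now_serving{0};   // advanced at task completion
};

// At task creation, take a ticket on every object the task writes.
unsigned acquire(VersionedObject& obj) {
    return obj.next_ticket.fetch_add(1, std::memory_order_relaxed);
}

// A task is ready once its ticket comes up; a real scheduler would
// queue the task instead of spinning.
void wait_turn(VersionedObject& obj, unsigned ticket) {
    while (obj.now_serving.load(std::memory_order_acquire) != ticket)
        std::this_thread::yield();
}

// On completion, wake the next dependent by advancing the counter.
void release(VersionedObject& obj) {
    obj.now_serving.fetch_add(1, std::memory_order_release);
}

int main() {
    VersionedObject obj;
    int shared = 0;

    // Three tasks that all write `shared`: tickets force them to run
    // in creation order even though the threads start in any order.
    std::vector<std::thread> pool;
    for (int i = 0; i < 3; ++i) {
        unsigned t = acquire(obj);        // dependency registered here
        pool.emplace_back([&, i, t] {
            wait_turn(obj, t);
            shared += i;                  // exclusive access by ticket
            std::printf("task %d ran, shared=%d\n", i, shared);
            release(obj);
        });
    }
    for (auto& th : pool) th.join();
    return 0;
}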