387 resultados para Parallelism
Resumo:
We present new algorithms which perform automatic parallelization via source-to-source transformations. The objective is to exploit goal-level, unrestricted independent andparallelism. The proposed algorithms use as targets new parallel execution primitives which are simpler and more flexible than the well-known &/2 parallel operator, which makes it possible to generate better parallel expressions by exposing more potential parallelism among the literals of a clause than is possible with &/2. The main differences between the algorithms stem from whether the order of the solutions obtained is preserved or not, and on the use of determinacy information. We briefly describe the environment where the algorithms have been implemented and the runtime platform in which the parallelized programs are executed. We also report on an evaluation of an implementation of our approach. We compare the performance obtained to that of previous annotation algorithms and show that relevant improvements can be obtained.
Resumo:
Logic programming systems which exploit and-parallelism among non-deterministic goals rely on notions of independence among those goals in order to ensure certain efficiency properties. "Non-strict" independence (NSI) is a more relaxed notion than the traditional notion of "strict" independence (SI) which still ensures the relevant efficiency properties and can allow considerable more parallelism than SI. However, all compilation technology developed to date has been based on SI, presumably because of the intrinsic complexity of exploiting NSI. This is related to the fact that NSI cannot be determined "a priori" as SI. This paper fills this gap by developing a technique for compile-time detection and annotation of NSI. It also proposes algorithms for combined compile- time/run-time detection, presenting novel run-time checks for this type of parallelism. Also, a transformation procedure to eliminate shared variables among parallel goals is presented, attempting to perform as much work as possible at compiletime. The approach is based on the knowledge of certain properties about run-time instantiations of program variables —sharing and freeness— for which compile-time technology is available, with new approaches being currently proposed.
Resumo:
El punto de vista de muchas otras aplicaciones que modifican las reglas de computación. En segundo lugar, y una vez generalizado el concepto de independencia, es necesario realizar un estudio exhaustivo de la efectividad de las herramientas de análisis en la tarea de la paralelizacion automática. Los resultados obtenidos de dicha evaluación permiten asegurar de forma empírica que la utilización de analizadores globales en la tarea de la paralelizacion automática es vital para la consecución de una paralelizarían efectiva. Por último, a la luz de los buenos resultados obtenidos sobre la efectividad de los analizadores de flujo globales basados en la interpretación abstracta, se presenta la generalización de las herramientas de análisis al contexto de los lenguajes lógicos restricciones y planificación dinámica.
Resumo:
Thesis (M.S.) - University of Illinois at Urbana-Champaign.
Resumo:
Microfilm of typescript. Ann Arbor, Mich. : University Microfilms, 1971.
Resumo:
"December 12, 1955"
Resumo:
Vita.
Resumo:
Mode of access: Internet.
Resumo:
Mode of access: Internet.
Resumo:
Mode of access: Internet.
Resumo:
Includes bibliographical references.
The effective use of implicit parallelism through the use of an object-oriented programming language
Resumo:
This thesis explores translating well-written sequential programs in a subset of the Eiffel programming language - without syntactic or semantic extensions - into parallelised programs for execution on a distributed architecture. The main focus is on constructing two object-oriented models: a theoretical self-contained model of concurrency which enables a simplified second model for implementing the compiling process. There is a further presentation of principles that, if followed, maximise the potential levels of parallelism. Model of Concurrency. The concurrency model is designed to be a straightforward target for mapping sequential programs onto, thus making them parallel. It aids the compilation process by providing a high level of abstraction, including a useful model of parallel behaviour which enables easy incorporation of message interchange, locking, and synchronization of objects. Further, the model is sufficient such that a compiler can and has been practically built. Model of Compilation. The compilation-model's structure is based upon an object-oriented view of grammar descriptions and capitalises on both a recursive-descent style of processing and abstract syntax trees to perform the parsing. A composite-object view with an attribute grammar style of processing is used to extract sufficient semantic information for the parallelisation (i.e. code-generation) phase. Programming Principles. The set of principles presented are based upon information hiding, sharing and containment of objects and the dividing up of methods on the basis of a command/query division. When followed, the level of potential parallelism within the presented concurrency model is maximised. Further, these principles naturally arise from good programming practice. Summary. In summary this thesis shows that it is possible to compile well-written programs, written in a subset of Eiffel, into parallel programs without any syntactic additions or semantic alterations to Eiffel: i.e. no parallel primitives are added, and the parallel program is modelled to execute with equivalent semantics to the sequential version. If the programming principles are followed, a parallelised program achieves the maximum level of potential parallelisation within the concurrency model.
Resumo:
ransition P-systems are based on biological membranes and try to emulate cell behavior and its evolution due to the presence of chemical elements. These systems perform computation through transition between two consecutive configurations, which consist in a m-tuple of multisets present at any moment in the existing m regions of the system. Transition between two configurations is performed by using evolution rules also present in each region. Among main Transition P-systems characteristics are massive parallelism and non determinism. This work is part of a very large project and tries to determine the design of a hardware circuit that can improve remarkably the process involved in the evolution of a membrane. Process in biological cells has two different levels of parallelism: the first one, obviously, is the evolution of each cell inside the whole set, and the second one is the application of the rules inside one membrane. This paper presents an evolution of the work done previously and includes an improvement that uses massive parallelism to do transition between two states. To achieve this, the initial set of rules is transformed into a new set that consists in all their possible combinations, and each of them is treated like a new rule (participant antecedents are added to generate a new multiset), converting an unique rule application in a way of parallelism in the means that several rules are applied at the same time. In this paper, we present a circuit that is able to process this kind of rules and to decode the result, taking advantage of all the potential that hardware has to implement P Systems versus previously proposed sequential solutions.
Resumo:
Heterogeneous computing systems have become common in modern processor architectures. These systems, such as those released by AMD, Intel, and Nvidia, include both CPU and GPU cores on a single die available with reduced communication overhead compared to their discrete predecessors. Currently, discrete CPU/GPU systems are limited, requiring larger, regular, highly-parallel workloads to overcome the communication costs of the system. Without the traditional communication delay assumed between GPUs and CPUs, we believe non-traditional workloads could be targeted for GPU execution. Specifically, this thesis focuses on the execution model of nested parallel workloads on heterogeneous systems. We have designed a simulation flow which utilizes widely used CPU and GPU simulators to model heterogeneous computing architectures. We then applied this simulator to non-traditional GPU workloads using different execution models. We also have proposed a new execution model for nested parallelism allowing users to exploit these heterogeneous systems to reduce execution time.