853 resultados para parallel execution


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years, applications in domains such as telecommunications, network security or large scale sensor networks showed the limits of the traditional store-then-process paradigm. In this context, Stream Processing Engines emerged as a candidate solution for all these applications demanding for high processing capacity with low processing latency guarantees. With Stream Processing Engines, data streams are not persisted but rather processed on the fly, producing results continuously. Current Stream Processing Engines, either centralized or distributed, do not scale with the input load due to single-node bottlenecks. Moreover, they are based on static configurations that lead to either under or over-provisioning. This Ph.D. thesis discusses StreamCloud, an elastic paralleldistributed stream processing engine that enables for processing of large data stream volumes. Stream- Cloud minimizes the distribution and parallelization overhead introducing novel techniques that split queries into parallel subqueries and allocate them to independent sets of nodes. Moreover, Stream- Cloud elastic and dynamic load balancing protocols enable for effective adjustment of resources depending on the incoming load. Together with the parallelization and elasticity techniques, Stream- Cloud defines a novel fault tolerance protocol that introduces minimal overhead while providing fast recovery. StreamCloud has been fully implemented and evaluated using several real word applications such as fraud detection applications or network analysis applications. The evaluation, conducted using a cluster with more than 300 cores, demonstrates the large scalability, the elasticity and fault tolerance effectiveness of StreamCloud. Resumen En los útimos años, aplicaciones en dominios tales como telecomunicaciones, seguridad de redes y redes de sensores de gran escala se han encontrado con múltiples limitaciones en el paradigma tradicional de bases de datos. En este contexto, los sistemas de procesamiento de flujos de datos han emergido como solución a estas aplicaciones que demandan una alta capacidad de procesamiento con una baja latencia. En los sistemas de procesamiento de flujos de datos, los datos no se persisten y luego se procesan, en su lugar los datos son procesados al vuelo en memoria produciendo resultados de forma continua. Los actuales sistemas de procesamiento de flujos de datos, tanto los centralizados, como los distribuidos, no escalan respecto a la carga de entrada del sistema debido a un cuello de botella producido por la concentración de flujos de datos completos en nodos individuales. Por otra parte, éstos están basados en configuraciones estáticas lo que conducen a un sobre o bajo aprovisionamiento. Esta tesis doctoral presenta StreamCloud, un sistema elástico paralelo-distribuido para el procesamiento de flujos de datos que es capaz de procesar grandes volúmenes de datos. StreamCloud minimiza el coste de distribución y paralelización por medio de una técnica novedosa la cual particiona las queries en subqueries paralelas repartiéndolas en subconjuntos de nodos independientes. Ademas, Stream- Cloud posee protocolos de elasticidad y equilibrado de carga que permiten una optimización de los recursos dependiendo de la carga del sistema. Unidos a los protocolos de paralelización y elasticidad, StreamCloud define un protocolo de tolerancia a fallos que introduce un coste mínimo mientras que proporciona una rápida recuperación. StreamCloud ha sido implementado y evaluado mediante varias aplicaciones del mundo real tales como aplicaciones de detección de fraude o aplicaciones de análisis del tráfico de red. La evaluación ha sido realizada en un cluster con más de 300 núcleos, demostrando la alta escalabilidad y la efectividad tanto de la elasticidad, como de la tolerancia a fallos de StreamCloud.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a static analysis that infers both upper and lower bounds on the usage that a logic program makes of a set of user-definable resources. The inferred bounds will in general be functions of input data sizes. A resource in our approach is a quite general, user-defined notion which associates a basic cost function with elementary operations. The analysis then derives the related (upper- and lower-bound) resource usage functions for all predicates in the program. We also present an assertion language which is used to define both such resources and resourcerelated properties that the system can then check based on the results of the analysis. We have performed some preliminary experiments with some concrete resources such as execution steps, bytes sent or received by an application, number of files left open, number of accesses to a datábase, number of calis to a procedure, number of asserts/retracts, etc. Applications of our analysis include resource consumption verification and debugging (including for mobile code), resource control in parallel/distributed computing, and resource-oriented specialization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Modeling the evolution of the state of program memory during program execution is critical to many parallehzation techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs at compile time, which can accurately determine many important program properties such as aliasing, logical data structures and shape. These properties are known to be critical for transforming a single threaded program into a versión that can be run on múltiple execution units in parallel. The analysis is shown to be of polynomial complexity in the size of the memory heap. Experimental results for benchmarks in the Jolden suite are given. These results show that in practice the analysis method is efflcient and is capable of accurately determining shape information in programs that créate and manipúlate complex data structures.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years a lot of research has been invested in parallel processing of numerical applications. However, parallel processing of Symbolic and AI applications has received less attention. This paper presents a system for parallel symbolic computitig, narned ACE, based on the logic programming paradigm. ACE is a computational model for the full Prolog language, capable of exploiting Or-parall< lism and Independent And-parallelism. In this paper vve focus on the implementation of the and-parallel part of the ACE system (ralled &ACE) on a shared memory multiprocessor, d< scribing its organization, some optimizations, and presenting some performance figures, proving the abilhy of &ACE to efficiently exploit parallelism.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a parallel graph narrowing machine, which is used to implement a functional logic language on a shared memory multiprocessor. It is an extensión of an abstract machine for a purely functional language. The result is a programmed graph reduction machine which integrates the mechanisms of unification, backtracking, and independent and-parallelism. In the machine, the subexpressions of an expression can run in parallel. In the case of backtracking, the structure of an expression is used to avoid the reevaluation of subexpressions as far as possible. Deterministic computations are detected. Their results are maintained and need not be reevaluated after backtracking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While logic programming languages offer a great deal of scope for parallelism, there is usually some overhead associated with the execution of goals in parallel because of the work involved in task creation and scheduling. In practice, therefore, the "granularity" of a goal, i.e. an estimate of the work available under it, should be taken into account when deciding whether or not to execute a goal concurrently as a sepárate task. This paper describes a method for estimating the granularity of a goal at compile time. The runtime overhead associated with our approach is usually quite small, and the performance improvements resulting from the incorporation of grainsize control can be quite good. This is shown by means of experimental results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An Independent And-Parallel Prolog model and implementation, &-Prolog, are described. The description includes a summary of the system's architecture, some details of its execution model (based on the RAP-WAM model), and most importantly, its performance on sequential workstations and shared memory multiprocessors as compared with state-of-the-art Prolog systems. Speedup curves are provided for a collection of benchmark programs which demónstrate significant speed advantages over state-of the art sequential systems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents and proves some fundamental results for independent and-parallelism (IAP). First, the paper treats the issues of correctness and efficiency: after defining strict and non-strict goal independence, it is proved that if strictly independent goals are executed in parallel the solutions obtained are the same as those produced by standard sequential execution. It is also shown that, in the absence of failure, the parallel proof procedure doesn't genérate any additional work (with respect to standard SLDresolution) while the actual execution time is reduced. The same results hold even if non-strictly independent goals are executed in parallel, provided a trivial rewriting of such goals is performed. In addition, and most importantly, treats the issue of compile-time generation of IAP by proposing conditions, to be written at compile-time, to efficiently check strict and non-strict goal independence at run-time and proving the sufficiency of such conditions. It is also shown how simpler conditions can be constructed if some information regarding the binding context of the goals to be executed in parallel is available to the compiler trough either local or program-level analysis. These results therefore provide a formal basis for the automatic compile-time generation of IAP. As a corollary of such results, the paper also proves that negative goals are always non-strictly independent, and that goals which share a first occurrence of an existential variable are never independent.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It has been shown that it is possible to exploit Independent/Restricted And-parallelism in logic programs while retaining the conventional "don't know" semantics of such programs. In particular, it is possible to parallelize pure Prolog programs while maintaining the semantics of the language. However, when builtin side-effects (such as write or assert) appear in the program, if an identical observable behaviour to that of sequential Prolog implementations is to be preserved, such side-effects have to be properly sequenced. Previously proposed solutions to this problem are either incomplete (lacking, for example, backtracking semantics) or they force sequentialization of significant portions of the execution graph which could otherwise run in parallel. In this paper a series of side-effect synchronization methods are proposed which incur lower overhead and allow more parallelism than those previously proposed. Most importantly, and unlike previous proposals, they have well-defined backward execution behaviour and require only a small modification to a given (And-parallel) Prolog implementation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds significantly beyond those of sequential systems, while supporting Prolog semantics and preserving sequential performance and storage efficiency. This paper presents simulation results supporting these claims with special emphasis on memory performance on a two-level sharedmemory multiprocessor organization. Several solutions to the cache coherency problem are analyzed. It is shown that RAP-WAM offers good locality and storage efficiency and that it can effectively take advantage of broadcast caches. It is argued that speeds in excess of 2 ML IPS on real applications exhibiting medium parallelism can be attained with current technology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The advantages of tabled evaluation regarding program termination and reduction of complexity are well known —as are the significant implementation, portability, and maintenance efforts that some proposals (especially those based on suspensión) require. This implementation effort is reduced by program transformation-based continuation cali techniques, at some eñrciency cost. However, the traditional formulation of this proposal by Ramesh and Cheng limits the interleaving of tabled and non-tabled predicates and thus cannot be used as-is for arbitrary programs. In this paper we present a complete translation for the continuation cali technique which, using the runtime support needed for the traditional proposal, solves these problems and makes it possible to execute arbitrary tabled programs. We present performance results which show that CCall offers a useful tradeoff that can be competitive with state-of-the-art implementations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Visualization of program executions has been found useful in applications which include education and debugging. However, traditional visualization techniques often fall short of expectations or are altogether inadequate for new programming paradigms, such as Constraint Logic Programming (CLP), whose declarative and operational semantics differ in some crucial ways from those of other paradigms. In particular, traditional ideas regarding flow control and the behavior of data often cannot be lifted in a straightforward way to (C)LP from other families of programming languages. In this paper we discuss techniques for visualizing program execution and data evolution in CLP. We briefly review some previously proposed visualization paradigms, and also propose a number of (to our knowledge) novel ones. The graphical representations have been chosen based on the perceived needs of a programmer trying to analyze the behavior and characteristics of an execution. In particular, we concéntrate on the representation of the program execution behavior (control), the runtime valúes of the variables, and the runtime constraints. Given our interest in visualizing large executions, we also pay attention to abstraction techniques, Le., techniques which are intended to help in reducing the complexity of the visual information.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the current prototype of the distributed CIAO system. It introduces the concepts of "teams" and "active modules" (or active objects), which conveniently encapsulate different types of functionalities desirable from a distributed system, from parallelism for achieving speedup to client-server applications. The user primitives available are presented and their implementation described. This implementation uses attributed variables and, as an example of a communication abstraction, a blackboard that follows the Linda model. Finally, the CIAO WWW interface is also briefly described. The unctionalities of the system are illustrated through examples, using the implemented primitives.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the current prototype of the distributed CIAO system. It introduces the concepts of "teams" and "active modules" (or active objects), which conveniently encapsulate different types of functionalities desirable from a distributed system, from parallelism for achieving speedup to client-server applications. It presents the user primitives available and describes their implementation. This implementation uses attributed variables and, as an example of a communication abstraction, a blackboard that follows the Linda model. The functionalities of the system are illustrated through examples, using the implemented primitives. The implementation of most of the primitives is also described in detail.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract is not available.