157 resultados para parallelization


Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we present a scalable software architecture for on-line multi-camera video processing, that guarantees a good trade off between computational power, scalability and flexibility. The software system is modular and its main blocks are the Processing Units (PUs), and the Central Unit. The Central Unit works as a supervisor of the running PUs and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application has been presented. In this paper, as case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions such as number of cameras and image sizes. The results show that the software architecture scales well with the number of camera and can easily works with different image formats respecting the real time constraints. Moreover, the parallelization approach can be used in order to speed up the processing tasks with a low level of overhead

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Studying independence of goals has proven very useful in the context of logic programming. In particular, it has provided a formal basis for powerful automatic parallelization tools, since independence ensures that two goals may be evaluated in parallel while preserving correctness and eciency. We extend the concept of independence to constraint logic programs (CLP) and prove that it also ensures the correctness and eciency of the parallel evaluation of independent goals. Independence for CLP languages is more complex than for logic programming as search space preservation is necessary but no longer sucient for ensuring correctness and eciency. Two additional issues arise. The rst is that the cost of constraint solving may depend upon the order constraints are encountered. The second is the need to handle dynamic scheduling. We clarify these issues by proposing various types of search independence and constraint solver independence, and show how they can be combined to allow dierent optimizations, from parallelism to intelligent backtracking. Sucient conditions for independence which can be evaluated \a priori" at run-time are also proposed. Our study also yields new insights into independence in logic programming languages. In particular, we show that search space preservation is not only a sucient but also a necessary condition for ensuring correctness and eciency of parallel execution.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present two concurrent semantics (i.e. semantics where concurrency is explicitely represented) for CC programs with atomic tells. One is based on simple partial orders of computation steps, while the other one is based on contextual nets and it is an extensión of a previous one for eventual CC programs. Both such semantics allow us to derive concurrency, dependency, and nondeterminism information for the considered languages. We prove some properties about the relation between the two semantics, and also about the relation between them and the operational semantics. Moreover, we discuss how to use the contextual net semantics in the context of CLP programs. More precisely, by interpreting concurrency as possible parallelism, our semantics can be useful for a safe parallelization of some CLP computation steps. Dually, the dependency information may also be interpreted as necessary sequentialization, thus possibly exploiting it for the task of scheduling CC programs. Moreover, our semantics is also suitable for CC programs with a new kind of atomic tell (called locally atomic tell), which checks for consistency only the constraints it depends on. Such a tell achieves a reasonable trade-off between efficiency and atomicity, since the checked constraints can be stored in a local memory and are thus easily accessible even in a distributed implementation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In recent years, applications in domains such as telecommunications, network security or large scale sensor networks showed the limits of the traditional store-then-process paradigm. In this context, Stream Processing Engines emerged as a candidate solution for all these applications demanding for high processing capacity with low processing latency guarantees. With Stream Processing Engines, data streams are not persisted but rather processed on the fly, producing results continuously. Current Stream Processing Engines, either centralized or distributed, do not scale with the input load due to single-node bottlenecks. Moreover, they are based on static configurations that lead to either under or over-provisioning. This Ph.D. thesis discusses StreamCloud, an elastic paralleldistributed stream processing engine that enables for processing of large data stream volumes. Stream- Cloud minimizes the distribution and parallelization overhead introducing novel techniques that split queries into parallel subqueries and allocate them to independent sets of nodes. Moreover, Stream- Cloud elastic and dynamic load balancing protocols enable for effective adjustment of resources depending on the incoming load. Together with the parallelization and elasticity techniques, Stream- Cloud defines a novel fault tolerance protocol that introduces minimal overhead while providing fast recovery. StreamCloud has been fully implemented and evaluated using several real word applications such as fraud detection applications or network analysis applications. The evaluation, conducted using a cluster with more than 300 cores, demonstrates the large scalability, the elasticity and fault tolerance effectiveness of StreamCloud. Resumen En los útimos años, aplicaciones en dominios tales como telecomunicaciones, seguridad de redes y redes de sensores de gran escala se han encontrado con múltiples limitaciones en el paradigma tradicional de bases de datos. En este contexto, los sistemas de procesamiento de flujos de datos han emergido como solución a estas aplicaciones que demandan una alta capacidad de procesamiento con una baja latencia. En los sistemas de procesamiento de flujos de datos, los datos no se persisten y luego se procesan, en su lugar los datos son procesados al vuelo en memoria produciendo resultados de forma continua. Los actuales sistemas de procesamiento de flujos de datos, tanto los centralizados, como los distribuidos, no escalan respecto a la carga de entrada del sistema debido a un cuello de botella producido por la concentración de flujos de datos completos en nodos individuales. Por otra parte, éstos están basados en configuraciones estáticas lo que conducen a un sobre o bajo aprovisionamiento. Esta tesis doctoral presenta StreamCloud, un sistema elástico paralelo-distribuido para el procesamiento de flujos de datos que es capaz de procesar grandes volúmenes de datos. StreamCloud minimiza el coste de distribución y paralelización por medio de una técnica novedosa la cual particiona las queries en subqueries paralelas repartiéndolas en subconjuntos de nodos independientes. Ademas, Stream- Cloud posee protocolos de elasticidad y equilibrado de carga que permiten una optimización de los recursos dependiendo de la carga del sistema. Unidos a los protocolos de paralelización y elasticidad, StreamCloud define un protocolo de tolerancia a fallos que introduce un coste mínimo mientras que proporciona una rápida recuperación. StreamCloud ha sido implementado y evaluado mediante varias aplicaciones del mundo real tales como aplicaciones de detección de fraude o aplicaciones de análisis del tráfico de red. La evaluación ha sido realizada en un cluster con más de 300 núcleos, demostrando la alta escalabilidad y la efectividad tanto de la elasticidad, como de la tolerancia a fallos de StreamCloud.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The &-Prolog system, a practical implementation of a parallel execution niodel for Prolog exploiting strict and non-strict independent and-parallelism, is described. Both automatic and manual parallelization of programs is supported. This description includes a summary of the system's language and architecture, some details of its execution model (based on the RAP-WAM model), and data on its performance on sequential workstations and shared memory multiprocessors, which is compared to that of current Prolog systems. The results to date show significant speed advantages over state-of-the-art sequential systems.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Set-Sharing domain has been widely used to infer at compiletime interesting properties of logic programs such as occurs-check reduction, automatic parallelization, and flnite-tree analysis. However, performing abstract uniflcation in this domain requires a closure operation that increases the number of sharing groups exponentially. Much attention has been given to mitigating this key inefflciency in this otherwise very useful domain. In this paper we present a novel approach to Set-Sharing: we define a new representation that leverages the complement (or negative) sharing relationships of the original sharing set, without loss of accuracy. Intuitively, given an abstract state sh\> over the finite set of variables of interest V, its negative representation is p(V) \ shy. Using this encoding during analysis dramatically reduces the number of elements that need to be represented in the abstract states and during abstract uniflcation as the cardinality of the original set grows toward 2 . To further compress the number of elements, we express the set-sharing relationships through a set of ternary strings that compacts the representation by eliminating redundancies among the sharing sets. Our experiments show that our approach can compress the number of relationships, reducing signiflcantly the memory usage and running time of all abstract operations, including abstract uniflcation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present two new algorithms which perform automatic parallelization via source-to-source transformations. The objective is to exploit goal-level, unrestricted independent and-parallelism. The proposed algorithms use as targets new parallel execution primitives which are simpler and more flexible than the well-known &/2 parallel operator. This makes it possible to genérate better parallel expressions by exposing more potential parallelism among the literals of a clause than is possible with &/2. The difference between the two algorithms stems from whether the order of the solutions obtained is preserved or not. We also report on a preliminary evaluation of an implementation of our approach. We compare the performance obtained to that of previous annotation algorithms and show that relevant improvements can be obtained.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Non-failure analysis aims at inferring that predicate calis in a program will never fail. This type of information has many applications in functional/logic programming. It is essential for determining lower bounds on the computational cost of calis, useful in the context of program parallelization, instrumental in partial evaluation and other program transformations, and has also been used in query optimization. In this paper, we re-cast the non-failure analysis proposed by Debray et al. as an abstract interpretation, which not only allows to investígate it from a standard and well understood theoretical framework, but has also several practical advantages. It allows us to incorpórate non-failure analysis into a standard, generic abstract interpretation engine. The analysis thus benefits from the fixpoint propagation algorithm, which leads to improved information propagation. Also, the analysis takes advantage of the multi-variance of the generic engine, so that it is now able to infer sepárate non-failure information for different cali patterns. Moreover, the implementation is simpler, and allows to perform non-failure and covering analyses alongside other analyses, such as those for modes and types, in the same framework. Finally, besides the precisión improvements and the additional simplicity, our implementation (in the Ciao/CiaoPP multiparadigm programming system) also shows better efRciency.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a technique for achieving a class of optimizations related to the reduction of checks within cycles. The technique uses both Program Transformation and Abstract Interpretation. After a ñrst pass of an abstract interpreter which detects simple invariants, program transformation is used to build a hypothetical situation that simpliñes some predicates that should be executed within the cycle. This transformation implements the heuristic hypothesis that once conditional tests hold they may continué doing so recursively. Specialized versions of predicates are generated to detect and exploit those cases in which the invariance may hold. Abstract interpretation is then used again to verify the truth of such hypotheses and conñrm the proposed simpliñcation. This allows optimizations that go beyond those possible with only one pass of the abstract interpreter over the original program, as is normally the case. It also allows selective program specialization using a standard abstract interpreter not speciñcally designed for this purpose, thus simplifying the design of this already complex module of the compiler. In the paper, a class of programs amenable to such optimization is presented, along with some examples and an evaluation of the proposed techniques in some application áreas such as floundering detection and reducing run-time tests in automatic logic program parallelization. The analysis of the examples presented has been performed automatically by an implementation of the technique using existing abstract interpretation and program transformation tools.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents and develops a generalized concept of Non-Strict Independent And Parallelism (NSIAP). NSIAP extends the applicability of Independent And- Parallelism (IAP) by enlarging the class of goals which are eligible for parallel execution. At the same time it maintains IAP's ability to run non-deterministic goals in parallel and to preserve the computational complexity expected in the execution of the program by the programmer. First, a parallel execution framework is defined and some fundamental correctness results, in the sense of equivalence of solutions with the sequential model, are discussed for this framework. The issue of efficiency is then considered. Two new definitions of NSI are given for the cases of puré and impure goals respectively and efficiency results are provided for programs parallelized under these definitions which include treatment of the case of goal failure: not only is reduction of execution time guaranteed (modulo run-time overheads) in the absence of failure but it is also shown that in the worst case of failure no speed-down will occur. In addition to applying to NSI, these results carry over and complete previous results shown in the context of IAP which did not deal with the case of goal failure. Finally, some practical examples of the application of the NSIAP concept to the parallelization of a set of programs are presented and performance results, showing the advantage of using NSI, are given.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Ciao is a logic-based, multi-paradigm programming system. One of its most distinguishing features is that it supports a large number of semantic and syntactic language features which can be selectively activated or deactivated for each program module. As a result, a module can be written in, for example, ISO-Prolog plus constraints and higher order, while another can be a puré logic module with a different control rule such as iterative deepening and/or tabling, and perhaps using constructive negation. A powerful and modular extensión mechanism allows user-level design and implementation of such features and sub-languages. Another distinguishing feature of Ciao is its powerful assertion language, which allows expressing many kinds of program properties (ranging from, e.g., moded types to resource consumption), as well as tests and documentation. The compiler is capable of statically ñnding violations of these properties or verifying that programs comply with them, and issuing certiñcates of this compliance. The compiler also performs many types of optimizations, including automatic parallelization. It offers very competitive performance, while retaining the flexibility and interactive development of a dynamic language. We will present a hands-on overview of the system, through small examples which emphasize the novel aspects and the motivations which lie behind Ciao's design and implementation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

CiaoPP is the abstract interpretation-based preprocessor of the Ciao multi-paradigm (Constraint) Logic Programming system. It uses modular, incremental abstract interpretation as a fundamental tool to obtain information about programs. In CiaoPP, the semantic approximations thus produced have been applied to perform high- and low-level optimizations during program compilation, including transformations such as múltiple abstract specialization, parallelization, partial evaluation, resource usage control, and program verification. More recently, novel and promising applications of such semantic approximations are being applied in the more general context of program development such as program verification. In this work, we describe our extensión of the system to incorpórate Abstraction-Carrying Code (ACC), a novel approach to mobile code safety. ACC follows the standard strategy of associating safety certificates to programs, originally proposed in Proof Carrying- Code. A distinguishing feature of ACC is that we use an abstraction (or abstract model) of the program computed by standard static analyzers as a certifícate. The validity of the abstraction on the consumer side is checked in a single-pass by a very efficient and specialized abstractinterpreter. We have implemented and benchmarked ACC within CiaoPP. The experimental results show that the checking phase is indeed faster than the proof generation phase, and that the sizes of certificates are reasonable. Moreover, the preprocessor is based on compile-time (and run-time) tools for the certification of CLP programs with resource consumption assurances.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Distributed parallel execution systems speed up applications by splitting tasks into processes whose execution is assigned to different receiving nodes in a high-bandwidth network. On the distributing side, a fundamental problem is grouping and scheduling such tasks such that each one involves sufñcient computational cost when compared to the task creation and communication costs and other such practical overheads. On the receiving side, an important issue is to have some assurance of the correctness and characteristics of the code received and also of the kind of load the particular task is going to pose, which can be specified by means of certificates. In this paper we present in a tutorial way a number of general solutions to these problems, and illustrate them through their implementation in the Ciao multi-paradigm language and program development environment. This system includes facilities for parallel and distributed execution, an assertion language for specifying complex programs properties (including safety and resource-related properties), and compile-time and run-time tools for performing automated parallelization and resource control, as well as certification of programs with resource consumption assurances and efñcient checking of such certificates.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The aim of program specialization is to optimize programs by exploiting certain knowledge about the context in which the program will execute. There exist many program manipulation techniques which allow specializing the program in different ways. Among them, one of the best known techniques is partial evaluation, often referred to simply as program specialization, which optimizes programs by specializing them for (partially) known input data. In this work we describe abstract specialization, a technique whose main features are: (1) specialization is performed with respect to "abstract" valúes rather than "concrete" ones, and (2) abstract interpretation rather than standard interpretation of the program is used in order to propágate information about execution states. The concept of abstract specialization is at the heart of the specialization system in CiaoPP, the Ciao system preprocessor. In this paper we present a unifying view of the different specialization techniques used in CiaoPP and discuss their potential applications by means of examples. The applications discussed include program parallelization, optimization of dynamic scheduling (concurreney), and integration of partial evaluation techniques.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system librarles), to genérate and simplify run-time tests, and to perform high-level program transformations such as múltiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.