765 resultados para Lunch programs
Resumo:
We present new algorithms which perform automatic parallelization via source-to-source transformations. The objective is to exploit goal-level, unrestricted independent andparallelism. The proposed algorithms use as targets new parallel execution primitives which are simpler and more flexible than the well-known &/2 parallel operator, which makes it possible to generate better parallel expressions by exposing more potential parallelism among the literals of a clause than is possible with &/2. The main differences between the algorithms stem from whether the order of the solutions obtained is preserved or not, and on the use of determinacy information. We briefly describe the environment where the algorithms have been implemented and the runtime platform in which the parallelized programs are executed. We also report on an evaluation of an implementation of our approach. We compare the performance obtained to that of previous annotation algorithms and show that relevant improvements can be obtained.
Resumo:
Finding useful sharing information between instances in object- oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables definitely do not share in memory we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways, including supporting multivariance, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances do not share and which ones are definitively null, and which uses the classes to refine the static information when inheritance is present. The use of a set sharing abstraction allows a more precise representation of the existing sharings and is crucial in achieving precision during interprocedural analysis. Carrying the domains in a combined way facilitates the interaction among them in the presence of multivariance in the analysis. We show through examples and experimentally that both the set sharing part of the domain as well as the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.
Resumo:
Effective static analyses have been proposed which allow inferring functions which bound the number of resolutions or reductions. These have the advantage of being independent from the platform on which the programs are executed and such bounds have been shown useful in a number of applications, such as granularity control in parallel execution. On the other hand, in certain distributed computation scenarios where different platforms come into play, with each platform having different capabilities, it is more interesting to express costs in metrics that include the characteristics of the platform. In particular, it is specially interesting to be able to infer upper and lower bounds on actual execution time. With this objective in mind, we propose a method which allows inferring upper and lower bounds on the execution times of procedures of a program in a given execution platform. The approach combines compile-time cost bounds analysis with a one-time profiling of the platform in order to determine the values of certain constants for that platform. These constants calibrate a cost model which from then on is able to compute statically time bound functions for procedures and to predict with a significant degree of accuracy the execution times of such procedures in the given platform. The approach has been implemented and integrated in the CiaoPP system.
Resumo:
Abstract interpreters rely on the existence of a nxpoint algorithm that calculates a least upper bound approximation of the semantics of the program. Usually, that algorithm is described in terms of the particular language in study and therefore it is not directly applicable to programs written in a different source language. In this paper we introduce a generic, block-based, and uniform representation of the program control flow graph and a language-independent nxpoint algorithm that can be applied to a variety of languages and, in particular, Java. Two major characteristics of our approach are accuracy (obtained through a topdown, context sensitive approach) and reasonable efficiency (achieved by means of memoization and dependency tracking techniques). We have also implemented the proposed framework and show some initial experimental results for standard benchmarks, which further support the feasibility of the solution adopted.
Resumo:
Dynamic scheduling increases the expressive power of logic programming languages, but also introduces some overhead. In this paper we present two classes of program transformations designed to reduce this additional overhead, while preserving the operational semantics of the original programs, modulo ordering of literals woken at the same time. The first class of transformations simplifies the delay conditions while the second class moves delayed literals later in the rule body. Application of the program transformations can be automated using information provided by compile-time analysis. We provide experimental results obtained from an implementation of the proposed techniques using the CIAO prototype compiler. Our results show that the techniques can lead to substantial performance improvement.
Resumo:
The term "Logic Programming" refers to a variety of computer languages and execution models which are based on the traditional concept of Symbolic Logic. The expressive power of these languages offers promise to be of great assistance in facing the programming challenges of present and future symbolic processing applications in Artificial Intelligence, Knowledge-based systems, and many other areas of computing. The sequential execution speed of logic programs has been greatly improved since the advent of the first interpreters. However, higher inference speeds are still required in order to meet the demands of applications such as those contemplated for next generation computer systems. The execution of logic programs in parallel is currently considered a promising strategy for attaining such inference speeds. Logic Programming in turn appears as a suitable programming paradigm for parallel architectures because of the many opportunities for parallel execution present in the implementation of logic programs. This dissertation presents an efficient parallel execution model for logic programs. The model is described from the source language level down to an "Abstract Machine" level suitable for direct implementation on existing parallel systems or for the design of special purpose parallel architectures. Few assumptions are made at the source language level and therefore the techniques developed and the general Abstract Machine design are applicable to a variety of logic (and also functional) languages. These techniques offer efficient solutions to several areas of parallel Logic Programming implementation previously considered problematic or a source of considerable overhead, such as the detection and handling of variable binding conflicts in AND-Parallelism, the specification of control and management of the execution tree, the treatment of distributed backtracking, and goal scheduling and memory management issues, etc. A parallel Abstract Machine design is offered, specifying data areas, operation, and a suitable instruction set. This design is based on extending to a parallel environment the techniques introduced by the Warren Abstract Machine, which have already made very fast and space efficient sequential systems a reality. Therefore, the model herein presented is capable of retaining sequential execution speed similar to that of high performance sequential systems, while extracting additional gains in speed by efficiently implementing parallel execution. These claims are supported by simulations of the Abstract Machine on sample programs.
Resumo:
We propose an analysis for detecting procedures and goals that are deterministic (i.e., that produce at most one solution at most once), or predicates whose clause tests are mutually exclusive (which implies that at most one of their clauses will succeed) even if they are not deterministic. The analysis takes advantage of the pruning operator in order to improve the detection of mutual exclusion and determinacy. It also supports arithmetic equations and disequations, as well as equations and disequations on terms, for which we give a complete satisfiability testing algorithm, w.r.t. available type information. We have implemented the analysis and integrated it in the CiaoPP system, which also infers automatically the mode and type information that our analysis takes as input. Experiments performed on this implementation show that the analysis is fairly accurate and efficient.
Resumo:
In this report we discuss some of the issues involved in the specialization and optimization of constraint logic programs with dynamic scheduling. Dynamic scheduling, as any other form of concurrency, increases the expressive power of constraint logic programs, but also introduces run-time overhead. The objective of the specialization and optimization is to reduce as much as possible such overhead automatically, while preserving the semantics of the original programs. This is done by program transformation based on global analysis. We present implementation techniques for this purpose and report on experimental results obtained from an implementation of the techniques in the context of the CIAO compiler.
Resumo:
This article presents in an informal way some early results on the design of a series of paradigms for visualization of the parallel execution of logic programs. The results presented here refer to the visualization of or-parallelism, as in MUSE and Aurora, deterministic dependent and-parallelism, as in Andorra-I, and independent and-parallelism as in &-Prolog. A tool has been implemented for this purpose and has been interfaced with these systems. Results are presented showing the visualization of executions from these systems and the usefulness of the resulting tool is briefly discussed.
Resumo:
The concept of independence has been recently generalized to the constraint logic programming (CLP) paradigm. Also, several abstract domains specifically designed for CLP languages, and whose information can be used to detect the generalized independence conditions, have been recently defined. As a result we are now in a position where automatic parallelization of CLP programs is feasible. In this paper we study the task of automatically parallelizing CLP programs based on such analyses, by transforming them to explicitly concurrent programs in our parallel CC platform (CIAO) as well as to AKL. We describe the analysis and transformation process, and study its efficiency, accuracy, and effectiveness in program parallelization. The information gathered by the analyzers is evaluated not only in terms of its accuracy, i.e. its ability to determine the actual dependencies among the program variables, but also of its effectiveness, measured in terms of code reduction in the resulting parallelized programs. Given that only a few abstract domains have been already defined for CLP, and that none of them were specifically designed for dependency detection, the aim of the evaluation is not only to asses the effectiveness of the available domains, but also to study what additional information it would be desirable to infer, and what domains would be appropriate for further improving the parallelization process.
Resumo:
Abstract interpretation-based data-flow analysis of logic programs is at this point relatively well understood from the point of view of general frameworks and abstract domains. On the other hand, comparatively little attention has been given to the problems which arise when analysis of a full, practical dialect of the Prolog language is attempted, and only few solutions to these problems have been proposed to date. Such problems relate to dealing correctly with all builtins, including meta-logical and extra-logical predicates, with dynamic predicates (where the program is modified during execution), and with the absence of certain program text during compilation. Existing proposals for dealing with such issues generally restrict in one way or another the classes of programs which can be analyzed if the information from analysis is to be used for program optimization. This paper attempts to fill this gap by considering a full dialect of Prolog, essentially following the recently proposed ISO standard, pointing out the problems that may arise in the analysis of such a dialect, and proposing a combination of known and novel solutions that together allow the correct analysis of arbitrary programs using the full power of the language.
Resumo:
Global analyzers traditionally read and analyze the entire program at once, in a non-incremental way. However, there are many situations which are not well suited to this simple model and which instead require reanalysis of certain parts of a program which has already been analyzed. In these cases, it appears inefficient to perform the analysis of the program again from scratch, as needs to be done with current systems. We describe how the fixpoint algorithms in current generic analysis engines can be extended to support incremental analysis. The possible changes to a program are classified into three types: addition, deletion, and arbitrary change. For each one of these, we provide one or more algorithms for identifying the parts of the analysis that must be recomputed and for performing the actual recomputation. The potential benefits and drawbacks of these algorithms are discussed. Finally, we present some experimental results obtained with an implementation of the algorithms in the PLAI generic abstract interpretation framework. The results show significant benefits when using the proposed incremental analysis algorithms.
Resumo:
In this paper, we examine the issue of memory management in the parallel execution of logic programs. We concentrate on non-deterministic and-parallel schemes which we believe present a relatively general set of problems to be solved, including most of those encountered in the memory management of or-parallel systems. We present a distributed stack memory management model which allows flexible scheduling of goals. Previously proposed models (based on the "Marker model") are lacking in that they impose restrictions on the selection of goals to be executed or they may require consume a large amount of virtual memory. This paper first presents results which imply that the above mentioned shortcomings can have significant performance impacts. An extension of the Marker Model is then proposed which allows flexible scheduling of goals while keeping (virtual) memory consumption down. Measurements are presented which show the advantage of this solution. Methods for handling forward and backward execution, cut and roll back are discussed in the context of the proposed scheme. In addition, the paper shows how the same mechanism for flexible scheduling can be applied to allow the efficient handling of the very general form of suspension that can occur in systems which combine several types of and-parallelism and more sophisticated methods of executing logic programs. We believe that the results are applicable to many and- and or-parallel systems.
Resumo:
Knowing the size of the terms to which program variables are bound at run-time in logic programs is required in a class of applications related to program optimization such as, for example, recursion elimination and granularity analysis. Such size is difficult to even approximate at compile time and is thus generally computed at run-time by using (possibly predefined) predicates which traverse the terms involved. We propose a technique based on program transformation which has the potential of performing this computation much more efficiently. The technique is based on finding program procedures which are called before those in which knowledge regarding term sizes is needed and which traverse the terms whose size is to be determined, and transforming such procedures so that they compute term sizes "on the fly". We present a systematic way of determining whether a given program can be transformed in order to compute a given term size at a given program point without additional term traversal. Also, if several such transformations are possible our approach allows finding minimal transformations under certain criteria. We also discuss the advantages and present some applications of our technique.
Resumo:
Bruynooghe described a framework for the top-down abstract interpretation of logic programs. In this framework, abstract interpretation is carried out by constructing an abstract and-or tree in a top-down fashion for a given query and program. Such an abstract interpreter requires fixpoint computation for programs which contain recursive predicates. This paper presents in detail a fixpoint algorithm that has been developed for this purpose and the motivation behind it. We start off by describing a simple-minded algorithm. After pointing out its shortcomings, we present a series of refinements to this algorithm, until we reach the final version. The aim is to give an intuitive grasp and provide justification for the relative complexity of the final algorithm. We also present an informal proof of correctness of the algorithm and some results obtained from an implementation.