109 resultados para parallel computation
Resumo:
In recent years a lot of research has been invested in parallel processing of numerical applications. However, parallel processing of Symbolic and AI applications has received less attention. This paper presents a system for parallel symbolic computitig, narned ACE, based on the logic programming paradigm. ACE is a computational model for the full Prolog language, capable of exploiting Or-parall< lism and Independent And-parallelism. In this paper vve focus on the implementation of the and-parallel part of the ACE system (ralled &ACE) on a shared memory multiprocessor, d< scribing its organization, some optimizations, and presenting some performance figures, proving the abilhy of &ACE to efficiently exploit parallelism.
Resumo:
This paper presents a study of the effectiveness of three different algorithms for the parallelization of logic programs based on compile-time detection of independence among goals. The algorithms are embedded in a complete parallelizing compiler, which incorporates different abstract interpretation-based program analyses. The complete system shows the task of automatic program parallelization to be practical. The trade-offs involved in using each of the algorithms in this task are studied experimentally, weaknesses of these identified, and possible improvements discussed.
Resumo:
Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also, a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with médium to large parallel execution overheads.
Resumo:
In this paper we present a novel execution model for parallel implementation of logic programs which is capable of exploiting both independent and-parallelism and or-parallelism in an efficient way. This model extends the stack copying approach, which has been successfully applied in the Muse system to implement or-parallelism, by integrating it with proven techniques used to support independent and-parallelism. We show how all solutions to non-deterministic andparallel goals are found without repetitions. This is done through recomputation as in Prolog (and in various and-parallel systems, like &-Prolog and DDAS), i.e., solutions of and-parallel goals are not shared. We propose a scheme for the efficient management of the address space in a way that is compatible with the apparently incompatible requirements of both and- and or-parallelism. We also show how the full Prolog language, with all its extra-logical features, can be supported in our and-or parallel system so that its sequential semantics is preserved. The resulting system retains the advantages of both purely or-parallel systems as well as purely and-parallel systems. The stack copying scheme together with our proposed memory management scheme can also be used to implement models that combine dependent and-parallelism and or-parallelism, such as Andorra and Prometheus.
Resumo:
We informally discuss several issues related to the parallel execution of logic programming systems and concurrent logic programming systems, and their generalization to constraint programming. We propose a new view of these systems, based on a particular definition of parallelism. We argüe that, under this view, a large number of the actual systems and models can be explained through the application, at different levéis of granularity, of only a few basic principies: determinism, non-failure, independence (also referred to as stability), granularity, etc. Also, and based on the convergence of concepts that this view brings, we sketch a model for the implementation of several parallel constraint logic programming source languages and models based on a common, generic abstract machine and an intermedíate kernel language.
Resumo:
This paper addresses the design of visual paradigms for observing the parallel execution of logic programs. First, an intuitive method is proposed for arriving at the design of a paradigm and its implementation as a tool for a given model of parallelism. This method is based on stepwise reñnement starting from the deñnition of basic notions such as events and observables and some precedence relationships among events which hold for the given model of parallelism. The method is then applied to several types of parallel execution models for logic programs (Orparallelism, Determinate Dependent And parallelism, Restricted and-parallelism) for which visualization paradigms are designed. Finally, VisAndOr, a tool which implements all of these paradigms is presented, together with a discussion of its usefulness through examples.
Resumo:
We present a parallel graph narrowing machine, which is used to implement a functional logic language on a shared memory multiprocessor. It is an extensión of an abstract machine for a purely functional language. The result is a programmed graph reduction machine which integrates the mechanisms of unification, backtracking, and independent and-parallelism. In the machine, the subexpressions of an expression can run in parallel. In the case of backtracking, the structure of an expression is used to avoid the reevaluation of subexpressions as far as possible. Deterministic computations are detected. Their results are maintained and need not be reevaluated after backtracking.
Resumo:
We argüe that in order to exploit both Independent And- and Or-parallelism in Prolog programs there is advantage in recomputing some of the independent goals, as opposed to all their solutions being reused. We present an abstract model, called the Composition- Tree, for representing and-or parallelism in Prolog Programs. The Composition-tree closely mirrors sequential Prolog execution by recomputing some independent goals rather than fully re-using them. We also outline two environment representation techniques for And-Or parallel execution of full Prolog based on the Composition-tree model abstraction. We argüe that these techniques have advantages over earlier proposals for exploiting and-or parallelism in Prolog.
Resumo:
The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds significantly beyond those of sequential systems, while supporting Prolog semantics and preserving sequential performance and storage efficiency. This paper presents simulation results supporting these claims with special emphasis on memory performance on a two-level sharedmemory multiprocessor organization. Several solutions to the cache coherency problem are analyzed. It is shown that RAP-WAM offers good locality and storage efficiency and that it can effectively take advantage of broadcast caches. It is argued that speeds in excess of 2 ML IPS on real applications exhibiting medium parallelism can be attained with current technology.
Resumo:
The interactions among three important issues involved in the implementation of logic programs in parallel (goal scheduling, precedence, and memory management) are discussed. A simplified, parallel memory management model and an efficient, load-balancing goal scheduling strategy are presented. It is shown how, for systems which support "don't know" non-determinism, special care has to be taken during goal scheduling if the space recovery characteristics of sequential systems are to be preserved. A solution based on selecting only "newer" goals for execution is described, and an algorithm is proposed for efficiently maintaining and determining precedence relationships and variable ages across parallel goals. It is argued that the proposed schemes and algorithms make it possible to extend the storage performance of sequential systems to parallel execution without the considerable overhead previously associated with it. The results are applicable to a wide class of parallel and coroutining systems, and they represent an efficient alternative to "all heap" or "spaghetti stack" allocation models.
Resumo:
Although the sequential execution speed of logic programs has been greatly improved by the concepts introduced in the Warren Abstract Machine (WAM), parallel execution represents the only way to increase this speed beyond the natural limits of sequential systems. However, most proposed parallel logic programming execution models lack the performance optimizations and storage efficiency of sequential systems. This paper presents a parallel abstract machine which is an extension of the WAM and is thus capable of supporting ANDParallelism without giving up the optimizations present in sequential implementations. A suitable instruction set, which can be used as a target by a variety of logic programming languages, is also included. Special instructions are provided to support a generalized version of "Restricted AND-Parallelism" (RAP), a technique which reduces the overhead traditionally associated with the run-time management of variable binding conflicts to a series of simple run-time checks, which select one out of a series of compiled execution graphs.
Resumo:
We propose a computational methodology -"B-LOG"-, which offers the potential for an effective implementation of Logic Programming in a parallel computer. We also propose a weighting scheme to guide the search process through the graph and we apply the concepts of parallel "branch and bound" algorithms in order to perform a "best-first" search using an information theoretic bound. The concept of "session" is used to speed up the search process in a succession of similar queries. Within a session, we strongly modify the bounds in a local database, while bounds kept in a global database are weakly modified to provide a better initial condition for other sessions. We also propose an implementation scheme based on a database machine using "semantic paging", and the "B-LOG processor" based on a scoreboard driven controller.
Resumo:
We present an overview of the stack-based memory management techniques that we used in our non-deterministic and-parallel Prolog systems: &-Prolog and DASWAM. We believe that the problems associated with non-deterministic and-parallel systems are more general than those encountered in or-parallel and deterministic and-parallel systems, which can be seen as subsets of this more general case. We develop on the previously proposed "marker scheme", lifting some of the restrictions associated with the selection of goals while keeping (virtual) memory consumption down. We also review some of the other problems associated with the stack-based management scheme, such as handling of forward and backward execution, cut, and roll-backs.
Resumo:
An abstract is not available.
Resumo:
Effective static analyses have been proposed which infer bounds on the number of resolutions or reductions. These have the advantage of being independent from the platform on which the programs are executed and have been shown to be useful in a number of applications, such as granularity control in parallel execution. On the other hand, in distributed computation scenarios where platforms with different capabilities come into play, it is necessary to express costs in metrics that include the characteristics of the platform. In particular, it is specially interesting to be able to infer upper and lower bounds on actual execution times. With this objective in mind, we propose an approach which combines compile-time analysis for cost bounds with a one-time profiling of the platform in order to determine the valúes of certain parameters for a given platform. These parameters calíbrate a cost model which, from then on, is able to compute statically time bound functions for procedures and to predict with a significant degree of accuracy the execution times of such procedures in the given platform. The approach has been implemented and integrated in the CiaoPP system.