169 resultados para Parallelizing Compilers


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Domain-specific languages (DSLs) are increasingly used as embedded languages within general-purpose host languages. DSLs provide a compact, dedicated syntax for specifying parts of an application related to specialized domains. Unfortunately, such language extensions typically do not integrate well with the development tools of the host language. Editors, compilers and debuggers are either unaware of the extensions, or must be adapted at a non-trivial cost. We present a novel approach to embed DSLs into an existing host language by leveraging the underlying representation of the host language used by these tools. Helvetia is an extensible system that intercepts the compilation pipeline of the Smalltalk host language to seamlessly integrate language extensions. We validate our approach by case studies that demonstrate three fundamentally different ways to extend or adapt the host language syntax and semantics.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

P-GENESIS is an extension to the GENESIS neural simulator that allows users to take advantage of parallel machines to speed up the simulation of their network models or concurrently simulate multiple models. P-GENESIS adds several commands to the GENESIS script language that let a script running on one processor execute remote procedure calls on other processors, and that let a script synchronize its execution with the scripts running on other processors. We present here some brief comments on the mechanisms underlying parallel script execution. We also offer advice on parallelizing parameter searches, partitioning network models, and selecting suitable parallel hardware on which to run P-GENESIS.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Despite the fact that input–output (IO) tables form a central part of the System of National Accounts, each individual country's national IO table exhibits more or less different features and characteristics, reflecting the country's socioeconomic idiosyncrasies. Consequently, the compilers of a multi-regional input–output table (MRIOT) are advised to thoroughly examine the conceptual as well as methodological differences among countries in the estimation of basic statistics for national IO tables and, if necessary, to carry out pre-adjustment of these tables into a common format prior to the MRIOT compilation. The objective of this study is to provide a practical guide for harmonizing national IO tables to construct a consistent MRIOT, referring to the adjustment practices used by the Institute of Developing Economies, JETRO (IDE-JETRO) in compiling the Asian International Input–Output Table.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We show a method for parallelizing top down dynamic programs in a straightforward way by a careful choice of a lock-free shared hash table implementation and randomization of the order in which the dynamic program computes its subproblems. This generic approach is applied to dynamic programs for knapsack, shortest paths, and RNA structure alignment, as well as to a state-of-the-art solution for minimizing the máximum number of open stacks. Experimental results are provided on three different modern multicore architectures which show that this parallelization is effective and reasonably scalable. In particular, we obtain over 10 times speedup for 32 threads on the open stacks problem.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We report on a detailed study of the application and effectiveness of program analysis based on abstract interpretation to automatic program parallelization. We study the case of parallelizing logic programs using the notion of strict independence. We first propose and prove correct a methodology for the application in the parallelization task of the information inferred by abstract interpretation, using a parametric domain. The methodology is generic in the sense of allowing the use of different analysis domains. A number of well-known approximation domains are then studied and the transformation into the parametric domain defined. The transformation directly illustrates the relevance and applicability of each abstract domain for the application. Both local and global analyzers are then built using these domains and embedded in a complete parallelizing compiler. Then, the performance of the domains in this context is assessed through a number of experiments. A comparatively wide range of aspects is studied, from the resources needed by the analyzers in terms of time and memory to the actual benefits obtained from the information inferred. Such benefits are evaluated both in terms of the characteristics of the parallelized code and of the actual speedups obtained from it. The results show that data flow analysis plays an important role in achieving efficient parallelizations, and that the cost of such analysis can be reasonable even for quite sophisticated abstract domains. Furthermore, the results also offer significant insight into the characteristics of the domains, the demands of the application, and the trade-offs involved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Program specialization optimizes programs for known valúes of the input. It is often the case that the set of possible input valúes is unknown, or this set is infinite. However, a form of specialization can still be performed in such cases by means of abstract interpretation, specialization then being with respect to abstract valúes (substitutions), rather than concrete ones. We study the múltiple specialization of logic programs based on abstract interpretation. This involves in principie, and based on information from global analysis, generating several versions of a program predicate for different uses of such predicate, optimizing these versions, and, finally, producing a new, "multiply specialized" program. While múltiple specialization has received theoretical attention, little previous evidence exists on its practicality. In this paper we report on the incorporation of múltiple specialization in a parallelizing compiler and quantify its effects. A novel approach to the design and implementation of the specialization system is proposed. The resulting implementation techniques result in identical specializations to those of the best previously proposed techniques but require little or no modification of some existing abstract interpreters. Our results show that, using the proposed techniques, the resulting "abstract múltiple specialization" is indeed a relevant technique in practice. In particular, in the parallelizing compiler application, a good number of run-time tests are eliminated and invariants extracted automatically from loops, resulting generally in lower overheads and in several cases in increased speedups.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Modeling the evolution of the state of program memory during program execution is critical to many parallehzation techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs at compile time, which can accurately determine many important program properties such as aliasing, logical data structures and shape. These properties are known to be critical for transforming a single threaded program into a versión that can be run on múltiple execution units in parallel. The analysis is shown to be of polynomial complexity in the size of the memory heap. Experimental results for benchmarks in the Jolden suite are given. These results show that in practice the analysis method is efflcient and is capable of accurately determining shape information in programs that créate and manipúlate complex data structures.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we study, through a concrete case, the feasibility of using a high-level, general-purpose logic language in the design and implementation of applications targeting wearable computers. The case study is a "sound spatializer" which, given real-time signáis for monaural audio and heading, generates stereo sound which appears to come from a position in space. The use of advanced compile-time transformations and optimizations made it possible to execute code written in a clear style without efñciency or architectural concerns on the target device, while meeting strict existing time and memory constraints. The final executable compares favorably with a similar implementation written in C. We believe that this case is representative of a wider class of common pervasive computing applications, and that the techniques we show here can be put to good use in a range of scenarios. This points to the possibility of applying high-level languages, with their associated flexibility, conciseness, ability to be automatically parallelized, sophisticated compile-time tools for analysis and verification, etc., to the embedded systems field without paying an unnecessary performance penalty.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The integration of powerful partial evaluation methods into practical compilers for logic programs is still far from reality. This is related both to 1) efficiency issues and to 2) the complications of dealing with practical programs. Regarding efnciency, the most successful unfolding rules used nowadays are based on structural orders applied over (covering) ancestors, i.e., a subsequence of the atoms selected during a derivation. Unfortunately, maintaining the structure of the ancestor relation during unfolding introduces significant overhead. We propose an efficient, practical local unfolding rule based on the notion of covering ancestors which can be used in combination with any structural order and allows a stack-based implementation without losing any opportunities for specialization. Regarding the second issue, we propose assertion-based techniques which allow our approach to deal with real programs that include (Prolog) built-ins and external predicates in a very extensible manner. Finally, we report on our implementation of these techniques in a practical partial evaluator, embedded in a state of the art compiler which uses global analysis extensively (the Ciao compiler and, specifically, its preprocessor CiaoPP). The performance analysis of the resulting system shows that our techniques, in addition to dealing with practical programs, are also significantly more efficient in time and somewhat more efficient in memory than traditional tree-based implementations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We study the múltiple specialization of logic programs based on abstract interpretation. This involves in general generating several versions of a program predícate for different uses of such predícate, making use of information obtained from global analysis performed by an abstract interpreter, and finally producing a new, "multiply specialized" program. While the topic of múltiple specialization of logic programs has received considerable theoretical attention, it has never been actually incorporated in a compiler and its effects quantified. We perform such a study in the context of a parallelizing compiler and show that it is indeed a relevant technique in practice. Also, we propose an implementation technique which has the same power as the strongest of the previously proposed techniques but requires little or no modification of an existing abstract interpreter.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a study of the effectiveness of global analysis in the parallelization of logic programs using strict independence. A number of well-known approximation domains are selected and tlieir usefulness for the application in hand is explained. Also, methods for using the information provided by such domains to improve parallelization are proposed. Local and global analyses are built using these domains and such analyses are embedded in a complete parallelizing compiler. Then, the performance of the domains (and the system in general) is assessed for this application through a number of experiments. We argüe that the results offer significant insight into the characteristics of these domains, the demands of the application, and the tradeoffs involved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a study of the effectiveness of three different algorithms for the parallelization of logic programs based on compile-time detection of independence among goals. The algorithms are embedded in a complete parallelizing compiler, which incorporates different abstract interpretation-based program analyses. The complete system shows the task of automatic program parallelization to be practical. The trade-offs involved in using each of the algorithms in this task are studied experimentally, weaknesses of these identified, and possible improvements discussed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Global data-flow analysis of (constraint) logic programs, which is generally based on abstract interpretation [7], is reaching a comparatively high level of maturity. A natural question is whether it is time for its routine incorporation in standard compilers, something which, beyond a few experimental systems, has not happened to date. Such incorporation arguably makes good sense only if: • the range of applications of global analysis is large enough to justify the additional complication in the compiler, and • global analysis technology can deal with all the features of "practical" languages (e.g., the ISO-Prolog built-ins) and "scales up" for large programs. We present a tutorial overview of a number of concepts and techniques directly related to the issues above, with special emphasis on the first one. In particular, we concéntrate on novel uses of global analysis during program development and debugging, rather than on the more traditional application área of program optimization. The idea of using abstract interpretation for validation and diagnosis has been studied in the context of imperative programming [2] and also of logic programming. The latter work includes issues such as using approximations to reduce the burden posed on programmers by declarative debuggers [6, 3] and automatically generating and checking assertions [4, 5] (which includes the more traditional type checking of strongly typed languages, such as Gódel or Mercury [1, 8, 9]) We also review some solutions for scalability including modular analysis, incremental analysis, and widening. Finally, we discuss solutions for dealing with meta-predicates, side-effects, delay declarations, constraints, dynamic predicates, and other such features which may appear in practical languages. In the discussion we will draw both from the literature and from our experience and that of others in the development and use of the CIAO system analyzer. In order to emphasize the practical aspects of the solutions discussed, the presentation of several concepts will be illustrated by examples run on the CIAO system, which makes extensive use of global analysis and assertions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a conditional parallelization process for and-parallelism based on the notion of non-strict independence, a more relaxed notion than the traditional of strict independence. By using this notion, a parallelism annotator can extract more parallelism from programs. On the other hand, the intrinsic complexity of non-strict independence poses new challenges to this task. We report here on the implementation we have accomplished of an annotator for non-strict independence, capable of producing both static and dynamic execution graphs. This implementation, along with the also implemented independence checker and their integration in our system, have resulted what is, to the best of our knowledge, the first parallelizing compiler based on nonstrict independence which produces dynamic execution graphs. The paper also presents a preliminary assessment of the implemented tools, comparing them with the existing ones for strict independence, which shows encouraging results.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The concept of independence has been recently generalized to the constraint logic programming (CLP) paradigm. Also, several abstract domains specifically designed for CLP languages, and whose information can be used to detect the generalized independence conditions, have been recently defined. As a result we are now in a position where automatic parallelization of CLP programs is feasible. In this paper we study the task of automatically parallelizing CLP programs based on such analyses, by transforming them to explicitly concurrent programs in our parallel CC platform (CIAO) as well as to AKL. We describe the analysis and transformation process, and study its efficiency, accuracy, and effectiveness in program parallelization. The information gathered by the analyzers is evaluated not only in terms of its accuracy, i.e. its ability to determine the actual dependencies among the program variables, but also of its effectiveness, measured in terms of code reduction in the resulting parallelized programs. Given that only a few abstract domains have been already defined for CLP, and that none of them were specifically designed for dependency detection, the aim of the evaluation is not only to asses the effectiveness of the available domains, but also to study what additional information it would be desirable to infer, and what domains would be appropriate for further improving the parallelization process.