Biblioteca Digital

11 resultados para Fortran

em Greenwich Academic Literature Archive - UK

Integrating user knowledge with information from parallelisation tools to facilitate the automatic generation of efficient parallel FORTRAN code

Relevância:

20.00% 20.00%

Publicador:

Resumo:

User supplied knowledge and interaction is a vital component of a toolkit for producing high quality parallel implementations of scalar FORTRAN numerical code. In this paper we consider the necessary components that such a parallelisation toolkit should possess to provide an effective environment to identify, extract and embed user relevant user knowledge. We also examine to what extent these facilities are available in leading parallelisation tools; in particular we discuss how these issues have been addressed in the development of the user interface of the Computer Aided Parallelisation Tools (CAPTools). The CAPTools environment has been designed to enable user exploration, interaction and insertion of user knowledge to facilitate the automatic generation of very efficient parallel code. A key issue in the user's interaction is control of the volume of information so that the user is focused on only that which is needed. User control over the level and extent of information revealed at any phase is supplied using a wide variety of filters. Another issue is the way in which information is communicated. Dependence analysis and its resulting graphs involve a lot of sophisticated rather abstract concepts unlikely to be familiar to most users of parallelising tools. As such, considerable effort has been made to communicate with the user in terms that they will understand. These features, amongst others, and their use in the parallelisation process are described and their effectiveness discussed.

Veja mais

Software tools for automating the parallelisation of FORTRAN computational mechanics codes

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is now clear that the concept of a HPC compiler which automatically produces highly efficient parallel implementations is a pipe-dream. Another route is to recognise from the outset that user information is required and to develop tools that embed user interaction in the transformation of code from scalar to parallel form, and then use conventional compilers with a set of communication calls. This represents the key idea underlying the development of the CAPTools software environment. The initial version of CAPTools is focused upon single block structured mesh computational mechanics codes. The capability for unstructured mesh codes is under test now and block structured meshes will be included next. The parallelisation process can be completed rapidly for modest codes and the parallel performance approaches that which is delivered by hand parallelisations.

Veja mais

Computer aided parallelisation tools (CAPTools) — conceptual overview and performance on the parallelisation of structured mesh codes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Computer Aided Parallelisation Tools (CAPTools) is a toolkit designed to automate as much as possible of the process of parallelising scalar FORTRAN 77 codes. The toolkit combines a very powerful dependence analysis together with user supplied knowledge to build an extremely comprehensive and accurate dependence graph. The initial version has been targeted at structured mesh computational mechanics codes (eg. heat transfer, Computational Fluid Dynamics (CFD)) and the associated simple mesh decomposition paradigm is utilised in the automatic code partition, execution control mask generation and communication call insertion. In this, the first of a series of papers [1–3] the authors discuss the parallelisations of a number of case study codes showing how the various component tools may be used to develop a highly efficient parallel implementation in a few hours or days. The details of the parallelisation of the TEAMKE1 CFD code are described together with the results of three other numerical codes. The resulting parallel implementations are then tested on workstation clusters using PVM and an i860-based parallel system showing efficiencies well over 80%.

Veja mais

Automatic code generation of overlapped communications in a parallelisation tool

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper addresses the exploitation of overlapping communication with calculation within parallel FORTRAN 77 codes for computational fluid dynamics (CFD) and computational structured dynamics (CSD). The obvious objective is to overlap interprocessor communication with calculation on each processor in a distributed memory parallel system and so improve the efficiency of the parallel implementation. A general strategy for converting synchronous to overlapped communication is presented together with tools to enable its automatic implementation in FORTRAN 77 codes. This strategy is then implemented within the parallelisation toolkit, CAPTools, to facilitate the automatic generation of parallel code with overlapped communications. The success of these tools are demonstrated on two codes from the NAS-PAR and PERFECT benchmark suites. In each case, the tools produce parallel code with overlapped communications which is as good as that which could be generated manually. The parallel performance of the codes also improve in line with expectation.

Veja mais

Automatic and effective multi-dimensional parallelisation of structured mesh based codes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The most common parallelisation strategy for many Computational Mechanics (CM) (typified by Computational Fluid Dynamics (CFD) applications) which use structured meshes, involves a 1D partition based upon slabs of cells. However, many CFD codes employ pipeline operations in their solution procedure. For parallelised versions of such codes to scale well they must employ two (or more) dimensional partitions. This paper describes an algorithmic approach to the multi-dimensional mesh partitioning in code parallelisation, its implementation in a toolkit for almost automatically transforming scalar codes to parallel form, and its testing on a range of ‘real-world’ FORTRAN codes. The concept of multi-dimensional partitioning is straightforward, but non-trivial to represent as a sufficiently generic algorithm so that it can be embedded in a code transformation tool. The results of the tests on fine real-world codes demonstrate clear improvements in parallel performance and scalability (over a 1D partition). This is matched by a huge reduction in the time required to develop the parallel versions when hand coded – from weeks/months down to hours/days.

Veja mais

Comparison of finite element and finite volume methods application in geometrically nonlinear stress analysis

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A novel three-dimensional finite volume (FV) procedure is described in detail for the analysis of geometrically nonlinear problems. The FV procedure is compared with the conventional finite element (FE) Galerkin approach. FV can be considered to be a particular case of the weighted residual method with a unit weighting function, where in the FE Galerkin method we use the shape function as weighting function. A Fortran code has been developed based on the finite volume cell vertex formulation. The formulation is tested on a number of geometrically nonlinear problems. In comparison with FE, the results reveal that FV can reach the FE results in a higher mesh density.

Veja mais

Portability, predictability and performance for parallel computing: BSP in practice

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We report on practical experience using the Oxford BSP Library to parallelize a large electromagnetic code, the British Aerospace finite-difference time-domain code EMMA T:FD3D. The Oxford BS Library is one of the first realizations of the Bulk Synchronous Parallel computational model to be targeted at numerically intensive scientific (typically Fortran) computing. The BAe EMMA code is one of the first large-scale applications to be parallelized using this library, and it is an important demonstration of the cost effectiveness of the BSP approach. We illustrate how BSP cost-modelling techniques can be used to predict and optimize performance for single-source programs across different parallel platforms. We provide predicted and observed performance figures for an industrial-strength, single-source parallel code for a variety of real parallel architectures: shared memory multiprocessors, workstation clusters and massively parallel platforms.

Veja mais

CAPLib - A 'thin layer' message passing library to support computational mechanics codes on distributed memory parallel systems

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Computer Aided Parallelisation Tools (CAPTools) [Ierotheou, C, Johnson SP, Cross M, Leggett PF, Computer aided parallelisation tools (CAPTools)-conceptual overview and performance on the parallelisation of structured mesh codes, Parallel Computing, 1996;22:163±195] is a set of interactive tools aimed to provide automatic parallelisation of serial FORTRAN Computational Mechanics (CM) programs. CAPTools analyses the user's serial code and then through stages of array partitioning, mask and communication calculation, generates parallel SPMD (Single Program Multiple Data) messages passing FORTRAN. The parallel code generated by CAPTools contains calls to a collection of routines that form the CAPTools communications Library (CAPLib). The library provides a portable layer and user friendly abstraction over the underlying parallel environment. CAPLib contains optimised message passing routines for data exchange between parallel processes and other utility routines for parallel execution control, initialisation and debugging. By compiling and linking with different implementations of the library, the user is able to run on many different parallel environments. Even with today's parallel systems the concept of a single version of a parallel application code is more of an aspiration than a reality. However for CM codes the data partitioning SPMD paradigm requires a relatively small set of message-passing communication calls. This set can be implemented as an intermediate `thin layer' library of message-passing calls that enables the parallel code (especially that generated automatically by a parallelisation tool such as CAPTools) to be as generic as possible. CAPLib is just such a `thin layer' message passing library that supports parallel CM codes, by mapping generic calls onto machine specific libraries (such as CRAY SHMEM) and portable general purpose libraries (such as PVM an MPI). This paper describe CAPLib together with its three perceived advantages over other routes: - as a high level abstraction, it is both easy to understand (especially when generated automatically by tools) and to implement by hand, for the CM community (who are not generally parallel computing specialists); - the one parallel version of the application code is truly generic and portable; - the parallel application can readily utilise whatever message passing libraries on a given machine yield optimum performance.

Veja mais

Using an interactive software environment for the parallelization of real-world scientific applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The parallelization of real-world compute intensive Fortran application codes is generally not a trivial task. If the time to complete the parallelization is to be significantly reduced then an environment is needed that will assist the programmer in the various tasks of code parallelization. In this paper the authors present a code parallelization environment where a number of tools that address the main tasks such as code parallelization, debugging and optimization are available. The ParaWise and CAPO parallelization tools are discussed which enable the near automatic parallelization of real-world scientific application codes for shared and distributed memory-based parallel systems. As user involvement in the parallelization process can introduce errors, a relative debugging tool (P2d2) is also available and can be used to perform nearly automatic relative debugging of a program that has been parallelized using the tools. A high quality interprocedural dependence analysis as well as user-tool interaction are also highlighted and are vital to the generation of efficient parallel code and in the optimization of the backtracking and speculation process used in relative debugging. Results of benchmark and real-world application codes parallelized are presented and show the benefits of using the environment

Veja mais

Comparison of three algorithms for nonlinear metal cutting problems

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper compares three alternative numerical algorithms applied to a nonlinear metal cutting problem. One algorithm is based on an explicit method and the other two are implicit. Domain decomposition (DD) is used to break the original domain into subdomains, each containing a properly connected, well-formulated and continuous subproblem. The serial version of the explicit algorithm is implemented in FORTRAN and its parallel version uses MPI (Message Passing Interface) calls. One implicit algorithm is implemented by coupling the state-of-the-art PETSc (Portable, Extensible Toolkit for Scientific Computation) software with in-house software in order to solve the subproblems. The second implicit algorithm is implemented completely within PETSc. PETSc uses MPI as the underlying communication library. Finally, a 2D example is used to test the algorithms and various comparisons are made.

Veja mais

A generic strategy for dynamic load balancing of distributed memory parallel computational mechanics using unstructured meshes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A large class of computational problems are characterised by frequent synchronisation, and computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorate in time. As the communication performance of parallel hardware steadily improves so load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load changes throughout the run. The major issues involved in a generic dynamic load balancing scheme will be investigated together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, which is a semi-automatic tool for parallelisation of mesh based FORTRAN codes.

Veja mais

11 resultados para Fortran

em Greenwich Academic Literature Archive - UK

Filtro por publicador