933 resultados para PARALLEL COMPUTING


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Motivated by accurate average-case analysis, MOdular Quantitative Analysis (MOQA) is developed at the Centre for Efficiency Oriented Languages (CEOL). In essence, MOQA allows the programmer to determine the average running time of a broad class of programmes directly from the code in a (semi-)automated way. The MOQA approach has the property of randomness preservation which means that applying any operation to a random structure, results in an output isomorphic to one or more random structures, which is key to systematic timing. Based on original MOQA research, we discuss the design and implementation of a new domain specific scripting language based on randomness preserving operations and random structures. It is designed to facilitate compositional timing by systematically tracking the distributions of inputs and outputs. The notion of a labelled partial order (LPO) is the basic data type in the language. The programmer uses built-in MOQA operations together with restricted control flow statements to design MOQA programs. This MOQA language is formally specified both syntactically and semantically in this thesis. A practical language interpreter implementation is provided and discussed. By analysing new algorithms and data restructuring operations, we demonstrate the wide applicability of the MOQA approach. Also we extend MOQA theory to a number of other domains besides average-case analysis. We show the strong connection between MOQA and parallel computing, reversible computing and data entropy analysis.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

BACKGROUND: Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, that are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. RESULTS: We have developed a CUDA based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. CONCLUSIONS: permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Computer Aided Parallelisation Tools (CAPTools) is a toolkit designed to automate as much as possible of the process of parallelising scalar FORTRAN 77 codes. The toolkit combines a very powerful dependence analysis together with user supplied knowledge to build an extremely comprehensive and accurate dependence graph. The initial version has been targeted at structured mesh computational mechanics codes (eg. heat transfer, Computational Fluid Dynamics (CFD)) and the associated simple mesh decomposition paradigm is utilised in the automatic code partition, execution control mask generation and communication call insertion. In this, the first of a series of papers [1–3] the authors discuss the parallelisations of a number of case study codes showing how the various component tools may be used to develop a highly efficient parallel implementation in a few hours or days. The details of the parallelisation of the TEAMKE1 CFD code are described together with the results of three other numerical codes. The resulting parallel implementations are then tested on workstation clusters using PVM and an i860-based parallel system showing efficiencies well over 80%.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The requirement for a very accurate dependence analysis to underpin software tools to aid the generation of efficient parallel implementations of scalar code is argued. The current status of dependence analysis is shown to be inadequate for the generation of efficient parallel code, causing too many conservative assumptions to be made. This paper summarises the limitations of conventional dependence analysis techniques, and then describes a series of extensions which enable the production of a much more accurate dependence graph. The extensions include analysis of symbolic variables, the development of a symbolic inequality disproof algorithm and its exploitation in a symbolic Banerjee inequality test; the use of inference engine proofs; the exploitation of exact dependence and dependence pre-domination attributes; interprocedural array analysis; conditional variable definition tracing; integer array tracing and division calculations. Analysis case studies on typical numerical code is shown to reduce the total dependencies estimated from conventional analysis by up to 50%. The techniques described in this paper have been embedded within a suite of tools, CAPTools, which combines analysis with user knowledge to produce efficient parallel implementations of numerical mesh based codes.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper addresses the exploitation of overlapping communication with calculation within parallel FORTRAN 77 codes for computational fluid dynamics (CFD) and computational structured dynamics (CSD). The obvious objective is to overlap interprocessor communication with calculation on each processor in a distributed memory parallel system and so improve the efficiency of the parallel implementation. A general strategy for converting synchronous to overlapped communication is presented together with tools to enable its automatic implementation in FORTRAN 77 codes. This strategy is then implemented within the parallelisation toolkit, CAPTools, to facilitate the automatic generation of parallel code with overlapped communications. The success of these tools are demonstrated on two codes from the NAS-PAR and PERFECT benchmark suites. In each case, the tools produce parallel code with overlapped communications which is as good as that which could be generated manually. The parallel performance of the codes also improve in line with expectation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The most common parallelisation strategy for many Computational Mechanics (CM) (typified by Computational Fluid Dynamics (CFD) applications) which use structured meshes, involves a 1D partition based upon slabs of cells. However, many CFD codes employ pipeline operations in their solution procedure. For parallelised versions of such codes to scale well they must employ two (or more) dimensional partitions. This paper describes an algorithmic approach to the multi-dimensional mesh partitioning in code parallelisation, its implementation in a toolkit for almost automatically transforming scalar codes to parallel form, and its testing on a range of ‘real-world’ FORTRAN codes. The concept of multi-dimensional partitioning is straightforward, but non-trivial to represent as a sufficiently generic algorithm so that it can be embedded in a code transformation tool. The results of the tests on fine real-world codes demonstrate clear improvements in parallel performance and scalability (over a 1D partition). This is matched by a huge reduction in the time required to develop the parallel versions when hand coded – from weeks/months down to hours/days.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The manufacture of materials products involves the control of a range of interacting physical phenomena. The material to be used is synthesised and then manipulated into some component form. The structure and properties of the final component are influenced by both interactions of continuum-scale phenomena and those at an atomistic-scale level. Moreover, during the processing phase there are some properties that cannot be measured (typically the liquid-solid phase change). However, it seems there is a potential to derive properties and other features from atomistic-scale simulations that are of key importance at the continuum scale. Some of the issues that need to be resolved in this context focus upon computational techniques and software tools facilitating: (i) the multiphysics modeling at continuum scale; (ii) the interaction and appropriate degrees of coupling between the atomistic through microstructure to continuum scale; and (iii) the exploitation of high-performance parallel computing power delivering simulation results in a practical time period. This paper discusses some of the attempts to address each of the above issues, particularly in the context of materials processing for manufacture.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Computer based mathematical models describing aircraft fire have a role to play in the design and development of safer aircraft, in the implementation of safer and more rigorous certification criteria and in post mortuum accident investigation. As the cost involved in performing large-scale fire experiments for the next generation 'Ultra High Capacity Aircraft' (UHCA) are expected to be prohibitively high, the development and use of these modelling tools may become essential if these aircraft are to prove a safe and viable reality. By describing the present capabilities and limitations of aircraft fire models, this paper will examine the future development of these models in the areas of large scale applications through parallel computing, combustion modelling and extinguishment modelling.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This chapter discusses the code parallelization environment, where a number of tools that address the main tasks, such as code parallelization, debugging, and optimization are available. The parallelization tools include ParaWise and CAPO, which enable the near automatic parallelization of real world scientific application codes for shared and distributed memory-based parallel systems. The chapter discusses the use of ParaWise and CAPO to transform the original serial code into an equivalent parallel code that contains appropriate OpenMP directives. Additionally, as user involvement can introduce errors, a relative debugging tool (P2d2) is also available and can be used to perform near automatic relative debugging of an OpenMP program that has been parallelized either using the tools or manually. In order for these tools to be effective in parallelizing a range of applications, a high quality fully inter-procedural dependence analysis, as well as user interaction is vital to the generation of efficient parallel code and in the optimization of the backtracking and speculation process used in relative debugging. Results of parallelized NASA codes are discussed and show the benefits of using the environment.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Code parallelization using OpenMP for shared memory systems is relatively easier than using message passing for distributed memory systems. Despite this, it is still a challenge to use OpenMP to parallelize application codes in a way that yields effective scalable performance when executed on a shared memory parallel system. We describe an environment that will assist the programmer in the various tasks of code parallelization and this is achieved in a greatly reduced time frame and level of skill required. The parallelization environment includes a number of tools that address the main tasks of parallelism detection, OpenMP source code generation, debugging and optimization. These tools include a high quality, fully interprocedural dependence analysis with user interaction capabilities to facilitate the generation of efficient parallel code, an automatic relative debugging tool to identify erroneous user decisions in that interaction and also performance profiling to identify bottlenecks. Finally, experiences of parallelizing some NASA application codes are presented to illustrate some of the benefits of using the evolving environment.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The utilization of the computational Grid processor network has become a common method for researchers and scientists without access to local processor clusters to avail of the benefits of parallel processing for compute-intensive applications. As a result, this demand requires effective and efficient dynamic allocation of available resources. Although static scheduling and allocation techniques have proved effective, the dynamic nature of the Grid requires innovative techniques for reacting to change and maintaining stability for users. The dynamic scheduling process requires quite powerful optimization techniques, which can themselves lack the performance required in reaction time for achieving an effective schedule solution. Often there is a trade-off between solution quality and speed in achieving a solution. This paper presents an extension of a technique used in optimization and scheduling which can provide the means of achieving this balance and improves on similar approaches currently published.