16 resultados para Automatic Query Refinement

em Greenwich Academic Literature Archive - UK


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The availability of a very accurate dependence graph for a scalar code is the basis for the automatic generation of an efficient parallel implementation. The strategy for this task which is encapsulated in a comprehensive data partitioning code generation algorithm is described. This algorithm involves the data partition, calculation of assignment ranges for partitioned arrays, addition of a comprehensive set of execution control masks, altering loop limits, addition and optimisation of communications for all data. In this context, the development and implementation of strategies to merge communications wherever possible has proved an important feature in producing efficient parallel implementations for numerical mesh based codes. The code generation strategies described here are embedded within the Computer Aided Parallelisation tools (CAPTools) software as a key part of a toolkit for automating as much as possible of the parallelisation process for mesh based numerical codes. The algorithms used enables parallelisation of real computational mechanics codes with only minor user interaction and without any prior manual customisation of the serial code to suit the parallelisation tool.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

User supplied knowledge and interaction is a vital component of a toolkit for producing high quality parallel implementations of scalar FORTRAN numerical code. In this paper we consider the necessary components that such a parallelisation toolkit should possess to provide an effective environment to identify, extract and embed user relevant user knowledge. We also examine to what extent these facilities are available in leading parallelisation tools; in particular we discuss how these issues have been addressed in the development of the user interface of the Computer Aided Parallelisation Tools (CAPTools). The CAPTools environment has been designed to enable user exploration, interaction and insertion of user knowledge to facilitate the automatic generation of very efficient parallel code. A key issue in the user's interaction is control of the volume of information so that the user is focused on only that which is needed. User control over the level and extent of information revealed at any phase is supplied using a wide variety of filters. Another issue is the way in which information is communicated. Dependence analysis and its resulting graphs involve a lot of sophisticated rather abstract concepts unlikely to be familiar to most users of parallelising tools. As such, considerable effort has been made to communicate with the user in terms that they will understand. These features, amongst others, and their use in the parallelisation process are described and their effectiveness discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper addresses the exploitation of overlapping communication with calculation within parallel FORTRAN 77 codes for computational fluid dynamics (CFD) and computational structured dynamics (CSD). The obvious objective is to overlap interprocessor communication with calculation on each processor in a distributed memory parallel system and so improve the efficiency of the parallel implementation. A general strategy for converting synchronous to overlapped communication is presented together with tools to enable its automatic implementation in FORTRAN 77 codes. This strategy is then implemented within the parallelisation toolkit, CAPTools, to facilitate the automatic generation of parallel code with overlapped communications. The success of these tools are demonstrated on two codes from the NAS-PAR and PERFECT benchmark suites. In each case, the tools produce parallel code with overlapped communications which is as good as that which could be generated manually. The parallel performance of the codes also improve in line with expectation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The most common parallelisation strategy for many Computational Mechanics (CM) (typified by Computational Fluid Dynamics (CFD) applications) which use structured meshes, involves a 1D partition based upon slabs of cells. However, many CFD codes employ pipeline operations in their solution procedure. For parallelised versions of such codes to scale well they must employ two (or more) dimensional partitions. This paper describes an algorithmic approach to the multi-dimensional mesh partitioning in code parallelisation, its implementation in a toolkit for almost automatically transforming scalar codes to parallel form, and its testing on a range of ‘real-world’ FORTRAN codes. The concept of multi-dimensional partitioning is straightforward, but non-trivial to represent as a sufficiently generic algorithm so that it can be embedded in a code transformation tool. The results of the tests on fine real-world codes demonstrate clear improvements in parallel performance and scalability (over a 1D partition). This is matched by a huge reduction in the time required to develop the parallel versions when hand coded – from weeks/months down to hours/days.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multilevel algorithms are a successful class of optimization techniques which addresses the mesh partitioning problem. They usually combine a graph contraction algorithm together with a local optimization method which refines the partition at each graph level. In this paper we present an enhancement of the technique which uses imbalance to achieve higher quality partitions. We also present a formulation of the Kernighan-Lin partition optimization algorithm which incorporates load-balancing. The resulting algorithm is tested against a different but related state-of-the-art partitioner and shown to provide improved results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The shared-memory programming model can be an effective way to achieve parallelism on shared memory parallel computers. Historically however, the lack of a programming standard using directives and the limited scalability have affected its take-up. Recent advances in hardware and software technologies have resulted in improvements to both the performance of parallel programs with compiler directives and the issue of portability with the introduction of OpenMP. In this study, the Computer Aided Parallelisation Toolkit has been extended to automatically generate OpenMP-based parallel programs with nominal user assistance. We categorize the different loop types and show how efficient directives can be placed using the toolkit's in-depth interprocedural analysis. Examples are taken from the NAS parallel benchmarks and a number of real-world application codes. This demonstrates the great potential of using the toolkit to quickly parallelise serial programs as well as the good performance achievable on up to 300 processors for hybrid message passing-directive parallelisations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work we show how automatic relative debugging can be used to find differences in computation between a correct serial program and an OpenMP parallel version of that program that does not yield correct results. Backtracking and re-execution are used to determine the first OpenMP parallel region that produces a difference in computation that may lead to an incorrect value the user has indicated. Our approach also lends itself to finding differences between parallel computations, where executing with M threads produces expected results but an N thread execution does not (M, N > 1, M ≠ N). OpenMP programs created using a parallelization tool are addressed by utilizing static analysis and directive information from the tool. Hand-parallelized programs, where OpenMP directives are inserted by the user, are addressed by performing data dependence and directive analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the multilevel paradigm and its potential to aid the solution of combinatorial optimisation problems. The multilevel paradigm is a simple one, which involves recursive coarsening to create a hierarchy of approximations to the original problem. An initial solution is found (sometimes for the original problem, sometimes the coarsest) and then iteratively refined at each level. As a general solution strategy, the multilevel paradigm has been in use for many years and has been applied to many problem areas (most notably in the form of multigrid techniques). However, with the exception of the graph partitioning problem, multilevel techniques have not been widely applied to combinatorial optimisation problems. In this paper we address the issue of multilevel refinement for such problems and, with the aid of examples and results in graph partitioning, graph colouring and the travelling salesman problem, make a case for its use as a metaheuristic. The results provide compelling evidence that, although the multilevel framework cannot be considered as a panacea for combinatorial problems, it can provide an extremely useful addition to the combinatorial optimisation toolkit. We also give a possible explanation for the underlying process and extract some generic guidelines for its future use on other combinatorial problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multilevel approaches to computational problems are pervasive across many areas of applied mathematics and scientific computing. The multilevel paradigm uses recursive coarsening to create a hierarchy of approximations to the original problem, then an initial solution is found for the coarsest problem and iteratively refined and improved at each level, coarsest to finest. The solution process is aided by the global perspective (or `global view') imparted to the optimisation by the coarsening. This paper looks at their application to the Vehicle Routing Problem.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We discuss the application of the multilevel (ML) refinement technique to the Vehicle Routing Problem (VRP), and compare it to its single-level (SL) counterpart. Multilevel refinement recursively coarsens to create a hierarchy of approximations to the problem and refines at each level. A SL algorithm, which uses a combination of standard VRP heuristics, is developed first to solve instances of the VRP. A ML version, which extends the global view of these heuristics, is then created, using variants of the construction and improvement heuristics at each level. Finally some multilevel enhancements are developed. Experimentation is used to find suitable parameter settings and the final version is tested on two well-known VRP benchmark suites. Results comparing both SL and ML algorithms are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We discuss the application of the multilevel (ML) refinement technique to the Vehicle Routing Problem (VRP), and compare it to its single-level (SL) counterpart. Multilevel refinement recursively coarsens to create a hierarchy of approximations to the problem and refines at each level. A SL heuristic, termed the combined node-exchange composite heuristic (CNCH), is developed first to solve instances of the VRP. A ML version (the ML-CNCH) is then created, using the construction and improvement heuristics of the CNCH at each level. Experimentation is used to find a suitable combination, which extends the global view of these heuristics. Results comparing both SL and ML are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new contactless pneumatic microfeeder based on distributed manipulation is proposed. By cooperation of dynamically programmable microactuators, the part to be conveyed floats over an air cushion and is moved to the desired location with the desired orientation. CFD simulations are used to test the validity of the proposed concept and refine the design of the microactuators