21 resultados para Dynamic load

em Greenwich Academic Literature Archive - UK


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parallel computing is now widely used in numerical simulation, particularly for application codes based on finite difference and finite element methods. A popular and successful technique employed to parallelize such codes onto large distributed memory systems is to partition the mesh into sub-domains that are then allocated to processors. The code then executes in parallel, using the SPMD methodology, with message passing for inter-processor interactions. In order to improve the parallel efficiency of an imbalanced structured mesh CFD code, a new dynamic load balancing (DLB) strategy has been developed in which the processor partition range limits of just one of the partitioned dimensions uses non-coincidental limits, as opposed to coincidental limits. The ‘local’ partition limit change allows greater flexibility in obtaining a balanced load distribution, as the workload increase, or decrease, on a processor is no longer restricted by the ‘global’ (coincidental) limit change. The automatic implementation of this generic DLB strategy within an existing parallel code is presented in this chapter, along with some preliminary results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The DRAMA library, developed within the European Commission funded (ESPRIT) project DRAMA, supports dynamic load-balancing for parallel (message-passing) mesh-based applications. The target applications are those with dynamic and solution-adaptive features. The focus within the DRAMA project was on finite element simulation codes for structural mechanics. An introduction to the DRAMA library will illustrate that the very general cost model and the interface designed specifically for application requirements provide simplified and effective access to a range of parallel partitioners. The main body of the paper will demonstrate the ability to provide dynamic load-balancing for parallel FEM problems that include: adaptive meshing, re-meshing, the need for multi-phase partitioning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as Computational Fluid Dynamics. This is normally achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this paper, we demonstrate how typical office-based PCs attached to a local area network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. A dynamic load balancing scheme was devised to allow the effective use of the software on heterogeneous PC networks. This scheme ensured that the impact between the parallel processing task and other computer users on the network was minimized thus allowing practical parallel processing within a conventional office environment. Copyright © 2006 John Wiley & Sons, Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A large class of computational problems are characterised by frequent synchronisation, and computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorate in time. As the communication performance of parallel hardware steadily improves so load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load changes throughout the run. The major issues involved in a generic dynamic load balancing scheme will be investigated together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, which is a semi-automatic tool for parallelisation of mesh based FORTRAN codes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This chapter describes a parallel optimization technique that incorporates a distributed load-balancing algorithm and provides an extremely fast solution to the problem of load-balancing adaptive unstructured meshes. Moreover, a parallel graph contraction technique can be employed to enhance the partition quality and the resulting strategy outperforms or matches results from existing state-of-the-art static mesh partitioning algorithms. The strategy can also be applied to static partitioning problems. Dynamic procedures have been found to be much faster than static techniques, to provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data. The method employs a new iterative optimization technique that balances the workload and attempts to minimize the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. The dynamic evolution of load has three major influences on possible partitioning techniques; cost, reuse, and parallelism. The unstructured mesh may be modified every few time-steps and so the load-balancing must have a low cost relative to that of the solution algorithm in between remeshing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A parallel method for dynamic partitioning of unstructured meshes is described. The method employs a new iterative optimisation technique which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As the complexity of parallel applications increase, the performance limitations resulting from computational load imbalance become dominant. Mapping the problem space to the processors in a parallel machine in a manner that balances the workload of each processors will typically reduce the run-time. In many cases the computation time required for a given calculation cannot be predetermined even at run-time and so static partition of the problem returns poor performance. For problems in which the computational load across the discretisation is dynamic and inhomogeneous, for example multi-physics problems involving fluid and solid mechanics with phase changes, the workload for a static subdomain will change over the course of a computation and cannot be estimated beforehand. For such applications the mapping of loads to process is required to change dynamically, at run-time in order to maintain reasonable efficiency. The issue of dynamic load balancing are examined in the context of PHYSICA, a three dimensional unstructured mesh multi-physics continuum mechanics computational modelling code.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new dynamic load balancing technique for structured mesh computational mechanics codes in which the processor partition range limits of just one of the partitioned dimensions uses non-coincidental limits, as opposed to using coincidental limits in all of the partitioned dimensions. The partition range limits are 'staggered', allowing greater flexibility in obtaining a balanced load distribution in comparison to when the limits are changed 'globally'. as the load increase/decrease on one processor no longer restricts the load decrease/increase on a neighbouring processor. The automatic implementation of this 'staggered' load balancing strategy within an existing parallel code is presented in this paper, along with some preliminary results.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In many areas of simulation, a crucial component for efficient numerical computations is the use of solution-driven adaptive features: locally adapted meshing or re-meshing; dynamically changing computational tasks. The full advantages of high performance computing (HPC) technology will thus only be able to be exploited when efficient parallel adaptive solvers can be realised. The resulting requirement for HPC software is for dynamic load balancing, which for many mesh-based applications means dynamic mesh re-partitioning. The DRAMA project has been initiated to address this issue, with a particular focus being the requirements of industrial Finite Element codes, but codes using Finite Volume formulations will also be able to make use of the project results.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

A method is outlined for optimising graph partitions which arise in mapping unstructured mesh calculations to parallel computers. The method employs a relative gain iterative technique to both evenly balance the workload and minimise the number and volume of interprocessor communications. A parallel graph reduction technique is also briefly described and can be used to give a global perspective to the optimisation. The algorithms work efficiently in parallel as well as sequentially and when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. Experiments indicate that the resulting parallel code can provide high quality partitions, independent of the initial partition, within a few seconds. The algorithms can also be used for dynamic load-balancing, reusing existing partitions and in this case the procedures are much faster than static techniques, provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network has the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs, for example a problem composed of ~100,000 cells would run 9.3 times faster on a network of 12 800MHz PCs than on a single 800MHz PC. It was also found that a network of eight 3.2GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2GHz Pentium computer. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that the impact between the parallel processing task and other computer users on the network was minimized.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A method is outlined for optimising graph partitions which arise in mapping un- structured mesh calculations to parallel computers. The method employs a combination of iterative techniques to both evenly balance the workload and minimise the number and volume of interprocessor communications. They are designed to work efficiently in parallel as well as sequentially and when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. The algorithms can also be used for dynamic load-balancing and a clustering technique can additionally be employed to speed up the whole process. Experiments indicate that the resulting parallel code can provide high quality partitions, independent of the initial partition, within a few seconds.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this Chapter we discuss the load-balancing issues arising in parallel mesh based computational mechanics codes for which the processor loading changes during the run. We briefly touch on geometric repartitioning ideas and then focus on different ways of using a graph both to solve the load-balancing problem and the optimisation problem, both locally and globally. We also briefly discuss whether repartitioning is always valid. Sample illustrative results are presented and we conclude that repartitioning is an attractive option if the load changes are not too dramatic and that there is a certain trade-off between partition quality and volume of data that the underlying application needs to migrate.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a dynamic distributed load balancing algorithm for parallel, adaptive Finite Element simulations in which we use preconditioned Conjugate Gradient solvers based on domain-decomposition. The load balancing is designed to maintain good partition aspect ratio and we show that cut size is not always the appropriate measure in load balancing. Furthermore, we attempt to answer the question why the aspect ratio of partitions plays an important role for certain solvers. We define and rate different kinds of aspect ratio and present a new center-based partitioning method of calculating the initial distribution which implicitly optimizes this measure. During the adaptive simulation, the load balancer calculates a balancing flow using different versions of the diffusion algorithm and a variant of breadth first search. Elements to be migrated are chosen according to a cost function aiming at the optimization of subdomain shapes. Experimental results for Bramble's preconditioner and comparisons to state-of-the-art load balancers show the benefits of the construction.