27 resultados para parallel scalability
em Greenwich Academic Literature Archive - UK
Resumo:
The difficulties encountered in implementing large scale CM codes on multiprocessor systems are now fairly well understood. Despite the claims of shared memory architecture manufacturers to provide effective parallelizing compilers, these have not proved to be adequate for large or complex programs. Significant programmer effort is usually required to achieve reasonable parallel efficiencies on significant numbers of processors. The paradigm of Single Program Multi Data (SPMD) domain decomposition with message passing, where each processor runs the same code on a subdomain of the problem, communicating through exchange of messages, has for some time been demonstrated to provide the required level of efficiency, scalability, and portability across both shared and distributed memory systems, without the need to re-author the code into a new language or even to support differing message passing implementations. Extension of the methods into three dimensions has been enabled through the engineering of PHYSICA, a framework for supporting 3D, unstructured mesh and continuum mechanics modeling. In PHYSICA, six inspectors are used. Part of the challenge for automation of parallelization is being able to prove the equivalence of inspectors so that they can be merged into as few as possible.
Resumo:
Abstract not available
Resumo:
Finance is one of the fastest growing areas in modern applied mathematics with real world applications. The interest of this branch of applied mathematics is best described by an example involving shares. Shareholders of a company receive dividends which come from the profit made by the company. The proceeds of the company, once it is taken over or wound up, will also be distributed to shareholders. Therefore shares have a value that reflects the views of investors about the likely dividend payments and capital growth of the company. Obviously such value will be quantified by the share price on stock exchanges. Therefore financial modelling serves to understand the correlations between asset and movements of buy/sell in order to reduce risk. Such activities depend on financial analysis tools being available to the trader with which he can make rapid and systematic evaluation of buy/sell contracts. There are other financial activities and it is not an intention of this paper to discuss all of these activities. The main concern of this paper is to propose a parallel algorithm for the numerical solution of an European option. This paper is organised as follows. First, a brief introduction is given of a simple mathematical model for European options and possible numerical schemes of solving such mathematical model. Second, Laplace transform is applied to the mathematical model which leads to a set of parametric equations where solutions of different parametric equations may be found concurrently. Numerical inverse Laplace transform is done by means of an inversion algorithm developed by Stehfast. The scalability of the algorithm in a distributed environment is demonstrated. Third, a performance analysis of the present algorithm is compared with a spatial domain decomposition developed particularly for time-dependent heat equation. Finally, a number of issues are discussed and future work suggested.
Resumo:
A new parallel approach for solving a pentadiagonal linear system is presented. The parallel partition method for this system and the TW parallel partition method on a chain of P processors are introduced and discussed. The result of this algorithm is a reduced pentadiagonal linear system of order P \Gamma 2 compared with a system of order 2P \Gamma 2 for the parallel partition method. More importantly the new method involves only half the number of communications startups than the parallel partition method (and other standard parallel methods) and hence is a far more efficient parallel algorithm.
Resumo:
Abstract not available
Resumo:
A large class of computational problems are characterised by frequent synchronisation, and computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorate in time. As the communication performance of parallel hardware steadily improves so load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load changes throughout the run. The major issues involved in a generic dynamic load balancing scheme will be investigated together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, which is a semi-automatic tool for parallelisation of mesh based FORTRAN codes.
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Unstructured mesh based codes for the modelling of continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Such codes have the potential to provide a high performance on parallel platforms for a small investment in programming. The critical parameters for success are to minimise changes to the code to allow for maintenance while providing high parallel efficiency, scalability to large numbers of processors and portability to a wide range of platforms. The paradigm of domain decomposition with message passing has for some time been demonstrated to provide a high level of efficiency, scalability and portability across shared and distributed memory systems without the need to re-author the code into a new language. This paper addresses these issues in the parallelisation of a complex three dimensional unstructured mesh Finite Volume multiphysics code and discusses the implications of automating the parallelisation process.
Resumo:
Abstract not available
Resumo:
A parallel method for the dynamic partitioning of unstructured meshes is outlined. The method includes diffusive load-balancing techniques and an iterative optimisation technique known as relative gain optimisationwhich both balances theworkload and attempts to minimise the interprocessor communications overhead. It can also optionally include amultilevel strategy. Experiments on a series of adaptively refined meshes indicate that the algorithmprovides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.
Resumo:
Abstract not available
Resumo:
This chapter describes a parallel optimization technique that incorporates a distributed load-balancing algorithm and provides an extremely fast solution to the problem of load-balancing adaptive unstructured meshes. Moreover, a parallel graph contraction technique can be employed to enhance the partition quality and the resulting strategy outperforms or matches results from existing state-of-the-art static mesh partitioning algorithms. The strategy can also be applied to static partitioning problems. Dynamic procedures have been found to be much faster than static techniques, to provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data. The method employs a new iterative optimization technique that balances the workload and attempts to minimize the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. The dynamic evolution of load has three major influences on possible partitioning techniques; cost, reuse, and parallelism. The unstructured mesh may be modified every few time-steps and so the load-balancing must have a low cost relative to that of the solution algorithm in between remeshing.