23 resultados para parallel processinng
em Greenwich Academic Literature Archive - UK
Resumo:
Abstract not available
Resumo:
A new parallel approach for solving a pentadiagonal linear system is presented. The parallel partition method for this system and the TW parallel partition method on a chain of P processors are introduced and discussed. The result of this algorithm is a reduced pentadiagonal linear system of order P \Gamma 2 compared with a system of order 2P \Gamma 2 for the parallel partition method. More importantly the new method involves only half the number of communications startups than the parallel partition method (and other standard parallel methods) and hence is a far more efficient parallel algorithm.
Resumo:
Abstract not available
Resumo:
A large class of computational problems are characterised by frequent synchronisation, and computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorate in time. As the communication performance of parallel hardware steadily improves so load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load changes throughout the run. The major issues involved in a generic dynamic load balancing scheme will be investigated together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, which is a semi-automatic tool for parallelisation of mesh based FORTRAN codes.
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
A parallel method for the dynamic partitioning of unstructured meshes is outlined. The method includes diffusive load-balancing techniques and an iterative optimisation technique known as relative gain optimisationwhich both balances theworkload and attempts to minimise the interprocessor communications overhead. It can also optionally include amultilevel strategy. Experiments on a series of adaptively refined meshes indicate that the algorithmprovides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.
Resumo:
The difficulties encountered in implementing large scale CM codes on multiprocessor systems are now fairly well understood. Despite the claims of shared memory architecture manufacturers to provide effective parallelizing compilers, these have not proved to be adequate for large or complex programs. Significant programmer effort is usually required to achieve reasonable parallel efficiencies on significant numbers of processors. The paradigm of Single Program Multi Data (SPMD) domain decomposition with message passing, where each processor runs the same code on a subdomain of the problem, communicating through exchange of messages, has for some time been demonstrated to provide the required level of efficiency, scalability, and portability across both shared and distributed memory systems, without the need to re-author the code into a new language or even to support differing message passing implementations. Extension of the methods into three dimensions has been enabled through the engineering of PHYSICA, a framework for supporting 3D, unstructured mesh and continuum mechanics modeling. In PHYSICA, six inspectors are used. Part of the challenge for automation of parallelization is being able to prove the equivalence of inspectors so that they can be merged into as few as possible.
Resumo:
Abstract not available
Resumo:
This chapter describes a parallel optimization technique that incorporates a distributed load-balancing algorithm and provides an extremely fast solution to the problem of load-balancing adaptive unstructured meshes. Moreover, a parallel graph contraction technique can be employed to enhance the partition quality and the resulting strategy outperforms or matches results from existing state-of-the-art static mesh partitioning algorithms. The strategy can also be applied to static partitioning problems. Dynamic procedures have been found to be much faster than static techniques, to provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data. The method employs a new iterative optimization technique that balances the workload and attempts to minimize the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. The dynamic evolution of load has three major influences on possible partitioning techniques; cost, reuse, and parallelism. The unstructured mesh may be modified every few time-steps and so the load-balancing must have a low cost relative to that of the solution algorithm in between remeshing.
Resumo:
This paper describes an parallel semi-Lagrangian finite difference approach to the pricing of early exercise Asian Options on assets with a stochastic volatility. A multigrid procedure is described for the fast iterative solution of the discrete linear complementarity problems that result. The accuracy and performance of this approach is improved considerably by a strike-price related analytic transformation of asset prices. Asian options are contingent claims with payoffs that depend on the average price of an asset over some time interval. The payoff may depend on this average and a fixed strike price (Fixed Strike Asians) or it may depend on the average and the asset price (Floating Strike Asians). The option may also permit early exercise (American contract) or confine the holder to a fixed exercise date (European contract). The Fixed Strike Asian with early exercise is considered here where continuous arithmetic averaging has been used. Pricing such an option where the asset price has a stochastic volatility leads to the requirement to solve a tri-variate partial differential inequation in the three state variables of asset price, average price and volatility (or equivalently, variance). The similarity transformations [6] used with Floating Strike Asian options to reduce the dimensionality of the problem are not applicable to Fixed Strikes and so the numerical solution of a tri-variate problem is necessary. The computational challenge is to provide accurate solutions sufficiently quickly to support realtime trading activities at a reasonable cost in terms of hardware requirements.
Resumo:
A parallel method for dynamic partitioning of unstructured meshes is described. The method employs a new iterative optimisation technique which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.