2 resultados para vectorization
em Chinese Academy of Sciences Institutional Repositories Grid Portal
Resumo:
A parallel strategy for solving multidimensional tridiagonal equations is investigated in this paper. We present in detail an improved version of single parallel partition (SPP) algorithm in conjunction with message vectorization, which aggregates several communication messages into one to reduce the communication cost. We show the resulting block SPP can achieve good speedup for a wide range of message vector length (MVL), especially when the number of grid points in the divided direction is large. Instead of only using the largest possible MVL, we adopt numerical tests and modeling analysis to determine an optimal MVL so that significant improvement in speedup can be obtained.
Resumo:
It has long been recognized that many direct parallel tridiagonal solvers are only efficient for solving a single tridiagonal equation of large sizes, and they become inefficient when naively used in a three-dimensional ADI solver. In order to improve the parallel efficiency of an ADI solver using a direct parallel solver, we implement the single parallel partition (SPP) algorithm in conjunction with message vectorization, which aggregates several communication messages into one to reduce the communication costs. The measured performances show that the longest allowable message vector length (MVL) is not necessarily the best choice. To understand this observation and optimize the performance, we propose an improved model that takes the cache effect into consideration. The optimal MVL for achieving the best performance is shown to depend on number of processors and grid sizes. Similar dependence of the optimal MVL is also found for the popular block pipelined method.