2022-TAP[J]-WeijiaHe-GPU-PMLFMA.pdf (1.46 MB)
Download fileSolving Electromagnetic Scattering Problems with Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMA
preprint
posted on 2022-05-26, 17:35 authored by Wei-Jia HeWei-Jia He, Zeng Yang, Xiao-Wei Huang, Wu Wang, Minglin Yang, Xin-Qing ShengIn this paper, a massively
parallel approach of the multilevel fast multipole algorithm (PMLFMA) on
graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is
presented for solving extremely large electromagnetic scattering problems
involving tens of billions of unknowns, In this approach, the flexible and
efficient ternary partitioning scheme is employed at first to partition the
MLFMA octree among message passing interface (MPI) processes. Then the
computationally intensive parts of the PMLFMA on each MPI process, matrix
filling, aggregation and disaggregation, etc., are accelerated by using the
GPU. Different parallelization strategies in coincidence with the ternary parallel
MLFMA approach are designed for GPU to ensures
a high computational throughput. Special memory
usage strategy is designed to improve the computational efficiency and benefit
data re-using. The CPU/GPU asynchronous computing pattern is designed with the
OpenMP and CUDA respectively for accelerating the CPU and GPU execution parts
and computation time overlapped. GPU architecture-based optimization strategies
are implemented to further improve the computational efficiency. Numerical
results demonstrate that the proposed GPU-PMLFMA can achieve over 3 times speed-up,
compared with the 8-threaded conventional PMLFMA.
Solutions of scattering by electrically large and
complicated objects with about 24000 wavelengths and over 41.8 billion
unknowns, are presented.