loading page

Solving Electromagnetic Scattering Problems with Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMA
  • +3
  • Wei-Jia He ,
  • Zeng Yang ,
  • Xiao-Wei Huang ,
  • Wu Wang ,
  • Minglin Yang ,
  • Xin-Qing Sheng
Wei-Jia He
Beijing Institute of Technology

Corresponding Author:[email protected]

Author Profile
Zeng Yang
Author Profile
Xiao-Wei Huang
Author Profile
Minglin Yang
Author Profile
Xin-Qing Sheng
Author Profile


In this paper, a massively parallel approach of the multilevel fast multipole algorithm (PMLFMA) on graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is presented for solving extremely large electromagnetic scattering problems involving tens of billions of unknowns, In this approach, the flexible and efficient ternary partitioning scheme is employed at first to partition the MLFMA octree among message passing interface (MPI) processes. Then the computationally intensive parts of the PMLFMA on each MPI process, matrix filling, aggregation and disaggregation, etc., are accelerated by using the GPU. Different parallelization strategies in coincidence with the ternary parallel MLFMA approach are designed for GPU to ensures a high computational throughput. Special memory usage strategy is designed to improve the computational efficiency and benefit data re-using. The CPU/GPU asynchronous computing pattern is designed with the OpenMP and CUDA respectively for accelerating the CPU and GPU execution parts and computation time overlapped. GPU architecture-based optimization strategies are implemented to further improve the computational efficiency. Numerical results demonstrate that the proposed GPU-PMLFMA can achieve over 3 times speed-up, compared with the 8-threaded conventional PMLFMA. Solutions of scattering by electrically large and complicated objects with about 24000 wavelengths and over 41.8 billion unknowns, are presented.
Jul 2022Published in IEEE Transactions on Antennas and Propagation volume 70 issue 7 on pages 5672-5682. 10.1109/TAP.2022.3161520