Performance Analysis of DFT and FFT Algorithms on Modern GPUs
Conventionally, the Fast Fourier Transform (FFT) has been preferred over the direct Discrete Fourier Transform (DFT) because it executes faster. However, the emergence of modern high-performance computing devices favors the DFT algorithm because of its inherent parallelism. This letter explores a straightforward one-dimensional DFT and evaluates its performance against NVIDIA's and AMD's highly optimized FFT libraries, cuFFT and clFFT, respectively. Performance is analyzed in terms of average kernel execution time for input sizes ranging from 2^1 to 2^25 samples on the NVIDIA GeForce GTX 1080 Ti and AMD Radeon RX 6800 XT graphics processors. The DFT algorithm achieves results comparable to the FFT routines for smaller input sizes, whereas it significantly outperforms the FFT libraries for larger input lengths. On the NVIDIA GeForce device, using the CUDA toolkit, the DFT shows an average performance increase of 177.7% over cuFFT. On the AMD Radeon device, using the OpenCL specification, the DFT outperforms clFFT by an average of 184.2%.
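The inherent parallelism mentioned above can be illustrated with a minimal Python sketch of the direct one-dimensional DFT. This is not the authors' GPU kernel; the function names dft_bin and dft are hypothetical and chosen only for illustration. The point it shows is that each output bin is computed from the input alone, so no bin depends on any other.

```python
import cmath


def dft_bin(x, k):
    """Compute a single output bin X[k] of the direct DFT of sequence x."""
    n = len(x)
    # X[k] = sum over t of x[t] * exp(-2*pi*i*k*t / n)
    return sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))


def dft(x):
    """Direct O(N^2) DFT; every bin is independent of the others."""
    # Assumption for illustration: on a GPU, each value of k could be
    # assigned to its own thread, since the bins share only read-only input.
    return [dft_bin(x, k) for k in range(len(x))]


if __name__ == "__main__":
    # A constant signal puts all of its energy in bin 0.
    print(dft([1, 1, 1, 1]))
```

Because the loop over k carries no dependency between iterations, it maps naturally onto one thread per output sample, which is the property the abstract attributes to the DFT.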
Email Address of Submitting Author: davuluruv1@udayton.edu
ORCID of Submitting Author: 0000-0002-4394-5405
Submitting Author's Institution: University of Dayton
Submitting Author's Country: United States of America