Performance Analysis of DFT and FFT Algorithms on Modern GPUs

Conventionally, the Fast Fourier Transform (FFT) has been preferred over the direct Discrete Fourier Transform (DFT) because it executes faster. However, the emergence of modern high-performance computing devices has favored the DFT algorithm because of its inherent parallelism. This letter explores a straightforward one-dimensional DFT whose performance is evaluated against NVIDIA’s and AMD’s highly optimized FFT libraries, cuFFT and clFFT, respectively. Performance is analyzed in terms of average kernel execution time for input sizes ranging from 2^1 to 2^25 samples on the NVIDIA GeForce GTX 1080 Ti and AMD Radeon RX 6800 XT graphics processors. The DFT achieves results comparable to the FFT routines for smaller input sizes, whereas it significantly outperforms the FFT libraries for larger input lengths. On the NVIDIA GeForce device, using the CUDA toolkit, the DFT shows an average performance increase of 177.7% over cuFFT. On the AMD Radeon device, using the OpenCL specification, the DFT outperforms clFFT by an average of 184.2%.
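
The "inherent parallelism" mentioned above comes from the form of the DFT definition, X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N): each output bin k depends only on the input samples, not on any other bin. The following is a minimal CPU sketch of that direct evaluation in plain Python, offered only as an illustration and not as the kernel evaluated in the letter. The function name `naive_dft` and the test signal are hypothetical; on a GPU, one plausible mapping (an assumption here) would assign each iteration of the outer loop to its own thread.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    n_samples = len(x)
    out = []
    # Each output bin k is computed independently of every other bin,
    # so this outer loop is the part that could run in parallel.
    for k in range(n_samples):
        acc = 0j
        for n in range(n_samples):
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
        out.append(acc)
    return out


# Hypothetical test input: a unit impulse, whose spectrum is flat.
signal = [1.0, 0.0, 0.0, 0.0]
spectrum = naive_dft(signal)
```

The sketch performs N^2 complex multiply-adds, compared with roughly N log N for an FFT, so any advantage on a GPU has to come from how well the independent bins map onto the hardware rather than from a lower operation count.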


Submitting Author's Institution

University of Dayton

Submitting Author's Country

  • United States of America