TechRxiv

ASFNet: Audio Spectrogram Fourier Network for Efficient Medical Sound Event Detection

preprint
posted on 2023-07-25, 02:57 authored by K. M. NAIMUL HASSAN, Mohammad Ariful Haque

Sound event detection (SED) in medical environments can support a range of healthcare tasks. Given the success of transformer encoder architectures in SED, they are a promising choice for detecting audio events in hospital settings. However, two main difficulties arise when detecting medical audio events with transformers. First, medical audio data is extremely scarce, which makes it difficult to train a transformer model effectively. Second, the SED model must be computationally efficient to be deployed in medical environments with limited resources, yet the attention mechanism gives the transformer high computational complexity. To address these challenges, this paper introduces the Audio Spectrogram Fourier Network (ASFNet), a novel attention-free transformer encoder designed specifically for sound event detection in the medical environment. ASFNet replaces the attention operation with a simplified Fast Fourier Transform. With this approach, ASFNet outperforms the other methods compared, achieving an average mAP of 0.474, a 16.76% relative improvement. ASFNet achieves this performance with fewer model parameters and a smaller model size, making it an efficient and effective solution for medical audio event detection.
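To illustrate what replacing attention with a Fourier transform can look like, here is a minimal sketch of a parameter-free token-mixing step. The abstract does not specify ASFNet's "simplified Fast Fourier Transform", so this assumes a common attention-free design: take the real part of a 2D discrete Fourier transform over the sequence and feature axes. The function names `fft` and `fourier_mixing` and the toy `patches` values are hypothetical and not taken from the paper.

```python
import cmath


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2])  # transform of even-indexed samples
    odd = fft(values[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fourier_mixing(x):
    """Mix a (seq_len x hidden_dim) list of rows with a 2D DFT, keeping the real part.

    Assumption: this stands in for the attention sublayer. It has no
    trainable weights, unlike query/key/value projections.
    """
    rows = [fft(row) for row in x]                 # FFT along the feature axis
    cols = [fft(list(col)) for col in zip(*rows)]  # FFT along the sequence axis
    # cols[j][i] holds the value at sequence index i, feature index j.
    return [[cols[j][i].real for j in range(len(cols))] for i in range(len(x))]


# Toy stand-in for spectrogram patch embeddings: 4 tokens x 2 features.
patches = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
mixed = fourier_mixing(patches)
print(len(mixed), len(mixed[0]))
```

The mixing step keeps the input shape and adds no parameters, which is consistent with the abstract's claim of fewer model parameters. Using an FFT also lowers the cost of mixing along the sequence axis from quadratic to O(n log n) in the sequence length. In a full encoder block this step would typically be followed by a residual connection, normalization, and a feed-forward layer, which are omitted here.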

History

Email Address of Submitting Author

naimul.hassan273@gmail.com

ORCID of Submitting Author

0000-0001-6282-4839

Submitting Author's Institution

Bangladesh University of Engineering and Technology

Submitting Author's Country

  • Bangladesh