
Sound source localization with multi-feature fusion using residuals and channel attention
  • Nipun Agarwal
Birla Institute of Technology and Science

Corresponding Author: [email protected]



Recent advances in deep learning have improved sound source localization under noise and reverberation. However, reliance on a single input feature and relatively simple network designs limits further gains. This work therefore proposes a multi-feature fusion sound source localization method based on residuals and channel attention. Multi-feature fusion provides the deep neural network with more comprehensive discriminative features, and the residuals and channel attention allow the network to fully extract sound source location information from the fused features. In simulations, the localization accuracies of the proposed method in single and multiple sound source scenarios are 8.24% and 15.54% higher, respectively, than those of the single-feature convolutional neural network (SF-CNN). The proposed method also performs well across different signal-to-noise ratios and reverberation times, which verifies its robustness to noise and reverberation. In the experiment, the proposed method remains effective in localizing single and multiple sound sources, with localization accuracies 5.48% and 7.57% higher, respectively, than those of SF-CNN. With high accuracy and strong robustness, the method is well suited to sound source localization in complex environments involving noise, reverberation, and multiple sound sources.
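
The three ideas named in the abstract (feature fusion, channel attention, residuals) can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not specify the input features, layer sizes, or attention design, so the sketch assumes squeeze-and-excitation-style gating, omits the convolutional layers a real residual block would contain, and uses hypothetical names (`feature_a`, `feature_b`, `w1`, `w2`) and toy values throughout.

```python
import math

def channel_attention(channels, w1, w2):
    """Return one gate in (0, 1) per channel (squeeze-and-excitation style, assumed)."""
    # Squeeze: summarize each channel by its global average.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: bottleneck dense layer with ReLU ...
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # ... followed by a dense layer with a sigmoid, giving per-channel gates.
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]

def residual_attention_block(channels, w1, w2):
    """Reweight channels by their gates, then add the identity shortcut."""
    gates = channel_attention(channels, w1, w2)
    # Residual connection: output = input + gate * input
    # (convolutions omitted for brevity).
    return [[x + g * x for x in ch] for ch, g in zip(channels, gates)]

# Hypothetical feature maps: each is a list of channels, each channel a list of bins.
feature_a = [[1.0, 3.0], [2.0, 2.0]]   # e.g. a 2-channel feature (placeholder)
feature_b = [[0.0, 4.0]]               # e.g. a 1-channel feature (placeholder)

# Multi-feature fusion, here taken to mean concatenation along the channel axis.
fused = feature_a + feature_b

# Toy weights: 3 channels -> 2 hidden units -> 3 gates. All zeros for illustration,
# which makes every gate sigmoid(0) = 0.5.
w1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
w2 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

gates = channel_attention(fused, w1, w2)
out = residual_attention_block(fused, w1, w2)
```

In a trained network the weights would be learned, so the gates would differ per channel and emphasize whichever fused features carry the most location information; the shortcut keeps the original signal available even when a gate is close to zero.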