Neural Comb Filtering using Sliding Window Attention Network for Speech Enhancement
  • Venkatesh Parvathala,
  • Sri Rama Murty Kodukula,
  • Siva Ganesh Andhavarapu
Venkatesh Parvathala
RGUKT-RK Valley

Corresponding Author: [email protected]


Abstract

In this paper, we demonstrate the significance of restoring the harmonics of the fundamental frequency (pitch) in deep neural network (DNN) based speech enhancement. We propose a sliding-window attention network to regress the spectral magnitude mask (SMM) from the noisy speech signal. Although the network parameters can be estimated by minimizing the mask loss alone, doing so does not restore the pitch harmonics, especially at higher frequencies. We therefore propose to restore the pitch harmonics in the spectral domain by minimizing a cepstral loss around the pitch peak, and estimate the network parameters using a combination of the mask loss and the cepstral loss. The proposed network architecture functions like an adaptive comb filter on voiced segments and emphasizes the pitch harmonics in the speech spectrum. The proposed approach achieves performance comparable to state-of-the-art methods at much lower computational complexity.
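The combination of a mask loss and a cepstral loss around the pitch peak can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact loss definitions, so the mean-squared-error form, the function names, the `weight` and `half_width` parameters, and the assumption that a per-frame pitch quefrency index and voicing decision are already available are all hypothetical choices made for illustration.

```python
import numpy as np

def spectral_magnitude_mask(clean_mag, noisy_mag, eps=1e-8):
    """SMM target: ratio of clean to noisy magnitude, per time-frequency bin."""
    return clean_mag / (noisy_mag + eps)

def real_cepstrum(mag, eps=1e-8):
    """Real cepstrum per frame: inverse real FFT of the log magnitude spectrum."""
    return np.fft.irfft(np.log(mag + eps), axis=-1)

def combined_loss(pred_mask, clean_mag, noisy_mag, pitch_quefrency, voiced,
                  half_width=2, weight=0.5):
    """Mask MSE plus cepstral MSE in a small window around the pitch peak.

    Shapes: magnitudes and mask are (frames, bins); pitch_quefrency is an
    integer array (frames,); voiced is a boolean array (frames,).
    half_width and weight are illustrative values, not taken from the paper.
    """
    target_mask = spectral_magnitude_mask(clean_mag, noisy_mag)
    mask_loss = np.mean((pred_mask - target_mask) ** 2)

    # Cepstra of the enhanced (masked noisy) and clean spectra.
    cep_diff = real_cepstrum(pred_mask * noisy_mag) - real_cepstrum(clean_mag)

    # Gather quefrency bins in [peak - half_width, peak + half_width] per frame.
    offsets = np.arange(-half_width, half_width + 1)
    idx = np.clip(pitch_quefrency[:, None] + offsets[None, :],
                  0, cep_diff.shape[-1] - 1)
    windowed = np.take_along_axis(cep_diff, idx, axis=-1)

    # Only voiced frames contribute to the cepstral term.
    if np.any(voiced):
        cepstral_loss = np.mean(windowed[voiced] ** 2)
    else:
        cepstral_loss = 0.0
    return mask_loss + weight * cepstral_loss
```

In this sketch the cepstral term penalizes mismatch only near the quefrency of the pitch peak, which corresponds to the periodic harmonic ripple in the log spectrum; the mask term alone treats every time-frequency bin independently and carries no such harmonic constraint.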
Jan 2023. Published in Circuits, Systems, and Signal Processing, volume 42, issue 1, pages 322-343. DOI: 10.1007/s00034-022-02123-2