Single Channel Noise Reduction based on an Auditory Filterbank

Steffen Kortlang, Stephan D. Ewert, and Timo Gerkmann
CvO Universität Oldenburg

Many noise reduction algorithms are designed in the short-time Fourier transform (STFT) domain, where the analysis yields frequency bands of constant bandwidth. In contrast, perceptually motivated analysis-resynthesis filterbanks, such as the Gammatone filterbank, provide a higher frequency resolution at low frequencies than at high frequencies. This variable frequency resolution goes along with a changed temporal resolution and thus potentially different temporal correlations in the different frequency bands. In this paper, we design a noise power spectral density estimator that operates at the output of a Gammatone filterbank. For this, we employ the state-of-the-art speech presence probability based estimator (Gerkmann and Hendriks, 2012). While the algorithm was originally designed in the STFT domain, here the parameters are adjusted based on a statistical analysis of the changed temporal correlation at the output of the Gammatone filters. The proposed approach yields an instrumentally predicted quality comparable to that of the STFT-based baseline and thus allows noise reduction to be integrated with other algorithms that work in a perceptually motivated spectral domain.
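
The difference in frequency resolution described above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the widely used equivalent rectangular bandwidth (ERB) approximation of Glasberg and Moore (1990) as a stand-in for Gammatone filter bandwidths, and the sampling rate, FFT length, and centre frequencies are hypothetical values chosen only for illustration.

```python
import numpy as np


def erb_bandwidth_hz(fc_hz):
    """ERB approximation (Glasberg and Moore, 1990), in Hz.

    Assumption: used here as a proxy for the bandwidth of a Gammatone
    filter centred at fc_hz; the paper's exact filter design may differ.
    """
    fc_hz = np.asarray(fc_hz, dtype=float)
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def stft_bin_spacing_hz(fs_hz, n_fft):
    """Spacing between STFT bins, identical for every band."""
    return fs_hz / n_fft


# Hypothetical illustration values, not parameters from the paper.
fs_hz = 16000.0
n_fft = 512
centre_freqs_hz = np.array([100.0, 1000.0, 4000.0])

erb_hz = erb_bandwidth_hz(centre_freqs_hz)
stft_hz = stft_bin_spacing_hz(fs_hz, n_fft)

for fc, bw in zip(centre_freqs_hz, erb_hz):
    print(f"fc = {fc:6.0f} Hz: ERB = {bw:6.1f} Hz, STFT spacing = {stft_hz:.2f} Hz")
```

Under these assumptions the auditory bandwidth grows with centre frequency while the STFT spacing stays fixed. Because the duration of a filter's impulse response scales roughly inversely with its bandwidth, the broader high-frequency bands respond faster than the narrow low-frequency bands, which is one way to see why temporal correlations may differ from band to band.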
