Contact

Head of division

Prof. Dr. Dr. Birger Kollmeier

+49 (0)441 798 5466 or 5470

W30 3-313

Office

Katja Warnken

+49 (0)441 798 5470

+49 (0)441 798-3902

W30 3-312

Kirsten Scheel

+49 (0)441 798-3813

+49 (0)441 798-3902

W30 3-312

Address (Mail address)

Medizinische Physik, Fakultät VI
Universität Oldenburg
26111 Oldenburg

Location / How to find us

For specific questions regarding one of our research topics, please contact the respective people directly (see staff list).

Paper Spille Kollmeier Meyer 2017

Combining binaural and cortical features for robust speech recognition

Constantin Spille, Birger Kollmeier and Bernd T. Meyer (2017)
IEEE/ACM Transactions on Audio, Speech, and Language Processing 25(4), 756 - 767, April 2017

The segregation of concurrent speakers and other sound sources is an important ability of the human auditory system, but it is missing in most current systems for automatic speech recognition (ASR), resulting in a large gap between human and machine performance. This study combines processing related to peripheral and cortical stages of the auditory pathway. First, a physiologically motivated binaural model estimates the positions of moving speakers to enhance the desired speech signal. Second, signals are converted to spectro-temporal Gabor features that resemble cortical speech representations and that have been shown to improve ASR in noisy conditions. Spectro-temporal Gabor features improve recognition results in all acoustic conditions under consideration compared to Mel-frequency cepstral coefficients (MFCCs). Binaural processing results in lower word error rates (WERs) in acoustic scenes with a concurrent speaker, whereas monaural processing should be preferred in the presence of a stationary masking noise. In-depth analysis of binaural processing identifies crucial processing steps, such as localization of sound sources and estimation of the beamformer’s noise coherence matrix, and shows how much each processing step affects recognition performance in acoustic conditions of differing complexity.
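The abstract reports its results as word error rates. As a reader aid, the following is a minimal sketch of how a WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions; the scoring tool actually used in the paper is not stated on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper, not taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("a b c", "a x c"))
```

Because insertions count as errors, a WER can exceed 1.0 when the hypothesis is much longer than the reference.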

Link to the publication

(Changed: 05 Apr 2023)