Prof. Dr. Dr. Birger Kollmeier

+49 (0)441 798-5466 / -5470

W30 3-313


Katja Warnken

+49 (0)441 798-5470

+49 (0)441 798-3902

W30 3-312

Address (postal address)

Medizinische Physik, Fakultät VI
Universität Oldenburg
26111 Oldenburg

Location / Directions

For specific questions about one of our research topics, please contact the relevant people directly (see the list of staff).

Joint estimation of reverberation time and early-to-late reverberation ratio from single-channel speech signals

Feifei Xiong, Stefan Goetze, Birger Kollmeier, Bernd T. Meyer (2019)
IEEE/ACM Transactions on Audio, Speech, and Language Processing 27(2), 255-267, Feb 2019, doi: 10.1109/TASLP.2018.2877894


The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key parameters commonly used to characterize acoustic room environments. In contrast to conventional blind estimation methods that process the two parameters separately, we propose a joint estimation model, referred to as the joint room parameter estimator (jROPE), that predicts the RT and the ELR simultaneously from single-channel speech signals using either full-band or sub-band frequency data. An artificial neural network is employed to learn the mapping from acoustic observations to the RT and the ELR classes. Auditory-inspired acoustic features obtained by temporal modulation filtering of the speech time-frequency representations are used as input for the neural network. Based on an in-depth analysis of the dependency between the RT and the ELR, a two-dimensional (RT, ELR) distribution with constrained boundaries is derived, which is then exploited to evaluate four different configurations for jROPE. Experimental results show that, compared with the single-task ROPE system, which estimates the RT or the ELR individually, jROPE provides improved results for both tasks in various reverberant and (diffuse) noisy environments. Among the four proposed joint types, the one incorporating multi-task learning with shared input and hidden layers yields the best estimation accuracies on average. In extreme reverberant conditions, with RTs and ELRs lying beyond the derived (RT, ELR) distribution, the type that treats RT and ELR as a joint parameter proves particularly robust. Compared with the state-of-the-art algorithms tested in the Acoustic Characterization of Environments (ACE) challenge, jROPE achieves results comparable to the best for all individual tasks (RT and ELR estimation from full-band and sub-band signals).
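
The multi-task configuration mentioned in the abstract (shared input and hidden layers, with separate outputs for the RT and the ELR classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the number of RT and ELR classes, the tanh activation, the equally weighted sum of the two cross-entropy losses, and all function names are assumptions chosen for illustration, and random vectors stand in for the modulation-filtered features.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_params(rng, n_in, n_hidden, n_rt, n_elr, scale=0.1):
    # One shared hidden layer plus two task-specific output layers.
    # All sizes are hypothetical, not taken from the paper.
    return {
        "W_h": rng.normal(0.0, scale, (n_in, n_hidden)),
        "b_h": np.zeros(n_hidden),
        "W_rt": rng.normal(0.0, scale, (n_hidden, n_rt)),
        "b_rt": np.zeros(n_rt),
        "W_elr": rng.normal(0.0, scale, (n_hidden, n_elr)),
        "b_elr": np.zeros(n_elr),
    }

def forward(params, x):
    # Shared representation used by both tasks.
    h = np.tanh(x @ params["W_h"] + params["b_h"])
    # Separate class posteriors for the RT and the ELR.
    p_rt = softmax(h @ params["W_rt"] + params["b_rt"])
    p_elr = softmax(h @ params["W_elr"] + params["b_elr"])
    return p_rt, p_elr

def joint_loss(p_rt, p_elr, y_rt, y_elr):
    # Sum of the two cross-entropy losses (equal weighting is an assumption).
    idx = np.arange(len(y_rt))
    ce_rt = -np.log(p_rt[idx, y_rt]).mean()
    ce_elr = -np.log(p_elr[idx, y_elr]).mean()
    return ce_rt + ce_elr

rng = np.random.default_rng(0)
params = init_params(rng, n_in=40, n_hidden=16, n_rt=5, n_elr=7)
x = rng.normal(size=(3, 40))      # stand-in for three feature vectors
y_rt = np.array([0, 1, 2])        # hypothetical RT class labels
y_elr = np.array([3, 4, 5])       # hypothetical ELR class labels
p_rt, p_elr = forward(params, x)
loss = joint_loss(p_rt, p_elr, y_rt, y_elr)
```

Because both output layers read from the same hidden activations, a gradient step on the summed loss would update the shared weights with information from both tasks, which is the general idea behind multi-task learning.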

Webmaster (last updated: 21.08.2020)