Automatic Speech and Audio Processing (ASAP)

Our group develops methods for improved machine listening, motivated by the remarkable capabilities of the healthy human auditory system. We combine deep machine learning, large data sets, and knowledge about acoustics and physiology to improve automated speech processing. The findings are applied to advance the fields of auditory healthcare, acoustics, and auditory neuroscience.

Some of the topics we're working on:

  • Development of self-administered hearing screening tests that can be performed by anybody on a daily basis (in contrast to time-consuming and expensive clinical testing).
  • Deep-learning-based models for predicting a listener's behaviour or perception, e.g., listening effort or speech intelligibility. These models are designed to work in real-life scenarios, which creates the potential to use them in individualized hearing devices for continuous monitoring and optimization of the signals presented to the listener.
  • Analysis of biodata (obtained by EEG, MEG, or ECoG measurements) to determine which speaker the listener is attending to in an acoustic scene. The result of this analysis can be fed back to a hearing device for signal optimization (e.g., spatial filtering) and therefore improved speech intelligibility. Our methods should also help answer how our auditory system effectively encodes speech for recognition and understanding.
  • Improvement of automatic speech recognition in complex acoustic scenes with multiple noise sources and reverberation. Can we estimate properties of our surroundings (e.g., reverberation time) and use them to adapt to a room? (Spoiler alert: Yes, we can.)


Standing (left to right): Tom, Nils, Bernd, Tobi.
Sitting (left to right): Paul, Jasper, Jana.
Digitally represented via Skype: Angel.


Webmaster (last updated: 10.09.2018)