Project B3 - Hierarchical models of acoustic information processing and their application for source detection and enhancement
At the core of this project is the CRC's acoustic communication loop, implemented as a hierarchy of consecutive processing layers that transform the sound field into an increasingly abstract and invariant ("high-level") representation. Once an active listening decision has been made at the top level, which is the counterpart to the subject's percept, the hierarchy is traversed in the reverse ("top-down") direction.
This reverse traversal makes it possible to infer optimal parameters for an assistive device that enhances the sound field in accordance with the user's wishes. To achieve this goal, the project combines methods from psychoacoustics for feature extraction, deep neural networks for learning in hierarchical architectures, and statistical signal processing for optimal parameter inference.
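The bottom-up and top-down traversal described above can be illustrated with a deliberately simplified toy sketch. It is not the project's implementation: the `PoolingLayer` class, the `run_loop` function, the averaging operation, the threshold rule standing in for the listening decision, and the per-frame gains standing in for device parameters are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PoolingLayer:
    """Hypothetical layer: averages groups of values on the way up
    and copies each value back to its group members on the way down."""
    width: int = 2

    def bottom_up(self, x: List[float]) -> List[float]:
        # Averaging neighbours gives a shorter, more invariant representation.
        groups = [x[i:i + self.width] for i in range(0, len(x), self.width)]
        return [sum(g) / len(g) for g in groups]

    def top_down(self, y: List[float], n: int) -> List[float]:
        # Expand back to the n values this layer received on the way up.
        return [y[i // self.width] for i in range(n)]


def run_loop(frames: List[float], layers: List[PoolingLayer],
             threshold: float) -> Tuple[List[float], List[float]]:
    """Return the top-level representation and per-frame gains."""
    sizes = []
    rep = list(frames)
    for layer in layers:                      # bottom-up pass
        sizes.append(len(rep))
        rep = layer.bottom_up(rep)
    # Assumed stand-in for the listening decision: keep high-energy segments.
    gains = [1.0 if v > threshold else 0.0 for v in rep]
    for layer, n in zip(reversed(layers), reversed(sizes)):  # top-down pass
        gains = layer.top_down(gains, n)
    return rep, gains


# Made-up frame energies: a loud first half and a quiet second half.
frames = [0.9, 0.7, 0.8, 1.0, 0.1, 0.3, 0.2, 0.0]
top, gains = run_loop(frames, [PoolingLayer(), PoolingLayer()], threshold=0.5)
print(gains)  # [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

In this toy version the decision taken on two abstract segments is mapped back to a gain for each of the eight input frames, which mirrors the idea of deriving low-level enhancement parameters from a high-level decision.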
Publications
2024
- Nustede EJ, Anemüller J (2024) On the generalization ability of complex-valued variational U-networks for single-channel speech enhancement. IEEE/ACM Transactions on Audio, Speech, and Language Processing 32: 3838-3849. DOI: 10.1109/TASLP.2024.3444492
2023
- Nustede EJ, Anemüller J (2023) Exploring visualization techniques for interpretable learning in speech enhancement deep neural networks. Proc. ITG Conference on Speech Communication, Aachen, Germany, Sep. 2023, pp. 220-224. DOI: 10.30420/456164043, https://ieeexplore.ieee.org/document/10363031
- Nustede EJ, Anemüller J (2023) Single-channel speech enhancement with deep complex U-networks and probabilistic latent space models. 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 04-10.06.2023, pp. 1-5. DOI: 10.1109/ICASSP49357.2023.10096208
2021
- Nustede EJ, Anemüller J (2021) Towards speech enhancement using a variational U-Net architecture. In Proc. European Signal Processing Conference (EUSIPCO), Dublin, Ireland, Aug. 2021. https://eurasip.org/Proceedings/Eusipco/Eusipco2021/pdfs/0000481.pdf
- Tammen M, Gode H, Kayser H, Nustede EJ, Westhausen NL, Anemüller J, Doclo S (2021) Combining binaural LCMP beamforming and deep multi-frame filtering for joint dereverberation and interferer reduction in the Clarity-2021 Challenge. Technical report. https://claritychallenge.org/clarity2021-workshop/papers/Clarity_2021_paper_tammen.pdf
- Urbschat A, Uppenkamp S, Anemüller J (2021) Searchlight classification informative region mixture model (SCIM): Identification of cortical regions showing discriminable BOLD patterns in event-related auditory fMRI data. Front Neurosci 14: 616906, 1-21. DOI: 10.3389/fnins.2020.616906
2020
- Anemüller J, Schoof H (2020) Machine listening in spatial acoustic scenes with deep networks in different microphone geometries. Northern Lights Deep Learning Workshop (NLDL). DOI: 10.7557/18.5151
- Urbschat A, Uppenkamp S, Anemüller J (2020) Searchlight Classification Informative Region Mixture Model (SCIM): Identification of cortical regions showing discriminable BOLD patterns in event-related auditory fMRI data. Frontiers in Neuroscience, 14:616906. DOI: 10.3389/fnins.2020.616906
2019
- Anemüller J, Schoof H (2019) Deep network source localization and the influence of sensor geometry. Proc. 23rd International Congress on Acoustics, Aachen, pp. 110-113. pub.dega-akustik.de/ICA2019/data/articles/001302.pdf
- Nustede EJ, Anemüller J (2019) Group delay features for sound event detection and localization (task 3) of the DCASE 2019 challenge. Detection Classification Acoust. Scenes Events Challenge, Technical report.