Workshop on "Computational Audition"



M. Heckmann - Hierarchical Spectro-Temporal Speech Features

Martin Heckmann, Xavier Domont, Samuel Ngouoko, Honda Research Institute Europe GmbH

In this presentation I will describe a hierarchical framework for the extraction of spectro-temporal acoustic features. The features are designed for higher robustness in dynamic environments. Motivated by the large gap between human and machine performance in such conditions, we take inspiration from the organization of the mammalian auditory cortex in the design of our features. This includes the joint processing of spectral and temporal information, the organization in hierarchical layers, competition between coequal features, the use of high-dimensional sparse feature spaces, and the learning of the underlying receptive fields in a data-driven manner. Because of these properties we termed them Hierarchical Spectro-Temporal (HIST) features. I will demonstrate via recognition results obtained in different environments that these features deliver information complementary to conventional spectral features and that this information improves recognition performance. Furthermore, I will highlight how a discriminative subspace projection of the features can further improve performance and how the features can adapt to different noise scenarios via an adaptive feature competition.

(Last updated: 16.03.2023)