Predicting Phoneme and Word Recognition using a Computational Model in Normal-Hearing Listeners

Arturo Moncada-Torres, Raphael Koning, Jan Wouters, and Tom Francart
ExpORL, Dept. of Neurosciences, KU Leuven, Herestraat 49, bus 721, 3000 Leuven, Belgium

Behavioural and psychophysical measurements in audiology are challenging and resource-consuming. Computational models complement this classical approach and offer several advantages, such as reproducibility, scalability, easy parametrization, and ease of use.

The purpose of this work was to predict the scores of behavioural measurements using the model proposed by Zilany and Bruce (2006). First, phoneme and word scores were obtained from 20 normal-hearing adults using the Lilliput speech material (378 CVC words organised in 20 lists) under five different signal-to-noise ratio (SNR) conditions. Scores were averaged across subjects. Then, for each SNR condition, a clean and a noisy version of each stimulus were input to the model, which yielded the corresponding auditory-nerve responses in the form of neurograms. The neurogram similarity index measure (NSIM) was used to quantify the resemblance between the two neurograms. Analysis of the behavioural scores showed that vowels were easier to identify than consonants, with speech recognition thresholds of -12.2 and -9.6 dB, respectively. Regarding the computational model, preliminary results showed significant moderate to strong correlations, ranging from 0.41 to 0.75 (median 0.545, p < 0.05), between the NSIM metric and the behavioural scores at the word level across lists. Future work will focus on further simulations at the phoneme level using Zilany and Bruce’s model, as well as on investigating its sensitivity to phoneme transitions.
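
The comparison between a clean and a noisy neurogram can be illustrated with a minimal sketch. It assumes an SSIM-style formulation of the NSIM, with an intensity term and a structure term computed over a 3x3 weighted window. The window weights, the constants, the dynamic range of 255, and the synthetic "neurograms" below are illustrative assumptions and are not taken from this abstract; real neurograms would come from the auditory-nerve model.

```python
import numpy as np


def _local_mean(img, kernel):
    """Weighted 3x3 local mean, using edge padding at the borders."""
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out


def nsim(reference, degraded, dynamic_range=255.0):
    """SSIM-style similarity between two neurograms (assumed formulation)."""
    r = np.asarray(reference, dtype=float)
    d = np.asarray(degraded, dtype=float)

    # Assumed Gaussian-like 3x3 window, normalised to sum to 1.
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)
    kernel /= kernel.sum()

    # Assumed stabilising constants, scaled by the dynamic range.
    c1 = (0.01 * dynamic_range) ** 2
    c3 = (0.03 * dynamic_range) ** 2 / 2.0

    mu_r = _local_mean(r, kernel)
    mu_d = _local_mean(d, kernel)
    # Clip tiny negative variances caused by floating-point rounding.
    var_r = np.maximum(_local_mean(r * r, kernel) - mu_r ** 2, 0.0)
    var_d = np.maximum(_local_mean(d * d, kernel) - mu_d ** 2, 0.0)
    cov = _local_mean(r * d, kernel) - mu_r * mu_d

    intensity = (2 * mu_r * mu_d + c1) / (mu_r ** 2 + mu_d ** 2 + c1)
    structure = (cov + c3) / (np.sqrt(var_r * var_d) + c3)
    return float(np.mean(intensity * structure))


# Synthetic stand-ins for neurograms (frequency bands x time frames).
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 255.0, size=(30, 100))
mild = clean + rng.normal(0.0, 5.0, size=clean.shape)
strong = clean + rng.normal(0.0, 50.0, size=clean.shape)

print("NSIM, clean vs. clean: ", nsim(clean, clean))
print("NSIM, clean vs. mild:  ", nsim(clean, mild))
print("NSIM, clean vs. strong:", nsim(clean, strong))
```

In such a setup, one NSIM value per list and SNR condition could then be correlated with the averaged behavioural scores, for example with `numpy.corrcoef`.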
