We propose a new objective intelligibility measure (OIM), the Gammachirp Envelope Similarity Index (GESI), which predicts the speech intelligibility (SI) of simulated hearing loss (HL) sounds for normal-hearing (NH) listeners. GESI is an intrusive method: it computes the SI metric from a reference sound and a test sound using the gammachirp filterbank (GCFB), a modulation filterbank, and an extended cosine similarity measure. GESI can handle a level asymmetry between the reference and test sounds and can reflect a hearing-impaired (HI) listener's hearing levels as they appear on the audiogram. A unique feature of GESI is its ability to incorporate an individual participant's listening condition into the SI prediction. We conducted four SI experiments on male and female speech sounds in both laboratory and crowdsourced remote environments. We then evaluated GESI and the conventional OIMs STOI, ESTOI, MBSTOI, and HASPI for their ability to predict mean and individual SI values with and without the use of simulated HL sounds. GESI outperformed the other OIMs in all evaluations. STOI, ESTOI, and MBSTOI did not predict SI at all, even when using the simulated HL sounds. HASPI predicted neither the difference between the laboratory and remote experiments on male speech sounds nor the individual SI values. GESI may provide a first step toward SI prediction for individual HI listeners whose HL is caused solely by peripheral dysfunction.
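The abstract does not give the form of the extended cosine similarity, so the following is only a minimal sketch of one way a cosine similarity could be made sensitive to a level difference between reference and test representations. The function name `weighted_cosine_similarity`, the weight `rho`, and the exponent-weighted normalization are all assumptions for illustration and are not taken from the text above. With `rho = 0.5` the expression reduces to the ordinary cosine similarity, which ignores overall level; other values make the score depend on the relative level of the two inputs.

```python
import math
from typing import Sequence


def weighted_cosine_similarity(
    ref: Sequence[float], test: Sequence[float], rho: float = 0.5
) -> float:
    """Cosine-like similarity with an asymmetric normalization.

    Hypothetical form (an assumption, not the published definition):
        sum(ref * test) / (sum(ref^2)^rho * sum(test^2)^(1 - rho))
    rho = 0.5 gives the standard cosine similarity.
    """
    if len(ref) != len(test):
        raise ValueError("ref and test must have the same length")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")

    dot = sum(r * t for r, t in zip(ref, test))
    ref_energy = sum(r * r for r in ref)
    test_energy = sum(t * t for t in test)

    # Treat an all-zero input as having no similarity instead of dividing by zero.
    if ref_energy == 0.0 or test_energy == 0.0:
        return 0.0

    return dot / (ref_energy ** rho * test_energy ** (1.0 - rho))


# Toy envelope vectors: the test is the reference attenuated by a factor of 2.
reference = [1.0, 2.0, 3.0]
attenuated = [0.5, 1.0, 1.5]

symmetric = weighted_cosine_similarity(reference, attenuated, rho=0.5)
asymmetric = weighted_cosine_similarity(reference, attenuated, rho=1.0)
```

In this toy case the symmetric score stays at 1 because the two vectors have the same shape, while the asymmetric score falls to 0.5 because the test vector is lower in level. That is the kind of behavior a level-aware measure needs when an attenuated, hearing-loss-simulated sound is compared against a clean reference.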