This study investigates how speech pathology resulting from surgical intervention (specifically, oral cancer surgery) affects the performance of an automatic speaker verification (ASV) system. Using two recently collected Dutch datasets, NKI-OC-VC and SPOKE, which contain parallel pre- and post-surgery audio from the same speaker, we assess the extent to which speech pathology influences ASV performance and whether objective and subjective measures of speech severity correlate with that performance. Finally, we carry out a perceptual study to compare the judgements of the ASV system and human listeners. Our findings reveal that pathological speech negatively affects ASV performance, and that speech severity is negatively correlated with performance. Perceptual and objective scores of speaker similarity and severity show moderate agreement; however, the perceptual study could not clearly establish whether the same phenomenon also exists in human perception.