Objective assessment of speech that reflects meaningful changes in communication is crucial for clinical decision making and reproducible research. While existing objective assessments, particularly reference-based approaches, can capture intelligibility changes, they are often hindered by a lack of explainability and the need for labor-intensive manual transcriptions. To address these issues, this work proposes the reference-free, explainable ASR Inconsistency Score. We evaluate this method on pathological speech in Dutch, Spanish, and English, and compare its performance to a reference-based Word Error Rate (WER) baseline. Our results demonstrate that the ASR Inconsistency Score achieves a high correlation with expert perceptual ratings, with performance closely matching, and in one case exceeding, the WER baseline.
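For context on the reference-based baseline mentioned above: WER is conventionally computed as the word-level edit distance between a manual reference transcript and an ASR hypothesis, normalized by the reference length. The sketch below is a standard illustrative implementation, not the paper's specific evaluation pipeline; the function name and interface are our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance divided by
    the number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # match/substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that this metric requires the labor-intensive manual reference transcript that the proposed reference-free score is designed to avoid.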