In automatic speech recognition (ASR), models need to be not only highly accurate but also transparent in their decision-making. This paper introduces and evaluates quality estimation (QE) metrics as a novel tool for enhancing explainable artificial intelligence (XAI) in ASR systems. Through experiments and analyses, it explores the ability of the NoRefER (No Reference Error Rate) metric to identify word-level errors and thereby help post-editors refine ASR hypotheses. The investigation extends to the use of NoRefER in corpus building, where it proves effective for augmenting datasets with insightful annotations. The diagnostic aspects of NoRefER are also examined, revealing that it can provide valuable insights into model behaviors and decision patterns, which has proven beneficial for prioritizing hypotheses in post-editing workflows and for fine-tuning ASR models. The findings suggest that NoRefER is not merely an error-detection tool but also a comprehensive framework for enhancing the transparency, efficiency, and effectiveness of ASR systems. To ensure reproducibility, all source code for this study is publicly available.