In automatic speech recognition (ASR), models need not only high accuracy but also transparency in their decision-making. This paper introduces and evaluates quality estimation (QE) metrics as a novel tool for enhancing explainable artificial intelligence (XAI) in ASR systems. Through experiments and analyses, it explores how well the NoRefER (No Reference Error Rate) metric identifies word-level errors to aid post-editors in refining ASR hypotheses. The investigation also covers the utility of NoRefER in corpus building, demonstrating its effectiveness in augmenting datasets with insightful annotations. The diagnostic aspects of NoRefER are examined as well, revealing its ability to provide insights into model behaviors and decision patterns, which proved beneficial for prioritizing hypotheses in post-editing workflows and for fine-tuning ASR models. The findings suggest that NoRefER is not merely a tool for error detection but also a comprehensive framework for enhancing the transparency, efficiency, and effectiveness of ASR systems. To ensure reproducibility, all source code for this study is made publicly available.