As large language models (LLMs) grow in parameter count and gain capabilities such as interaction through prompting, they open up new ways of interfacing with automatic speech recognition (ASR) systems beyond rescoring n-best lists. This work investigates post-hoc correction of ASR transcripts with LLMs. To avoid introducing errors into likely accurate transcripts, we propose a range of confidence-based filtering methods. Our results indicate that this approach can improve the performance of less competitive ASR systems.
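The filtering idea above can be sketched as follows. This is a minimal illustration, not the paper's exact method: the threshold value, the use of mean token confidence, and the function name are all assumptions for the example; real ASR systems expose confidence scores in system-specific ways.

```python
# Hedged sketch of confidence-based filtering before LLM post-correction.
# Assumption: each hypothesis comes with per-token confidence scores;
# the 0.9 threshold is illustrative, not taken from the paper.

def filter_for_correction(hypotheses, threshold=0.9):
    """Split transcripts into those sent to the LLM and those kept as-is.

    hypotheses: list of (transcript, [token_confidences]) tuples.
    Transcripts whose mean token confidence is at or above the threshold
    are kept unchanged, to avoid introducing errors into likely
    accurate transcripts; the rest are routed to LLM correction.
    """
    to_correct, keep = [], []
    for text, confs in hypotheses:
        mean_conf = sum(confs) / len(confs) if confs else 0.0
        if mean_conf >= threshold:
            keep.append(text)
        else:
            to_correct.append(text)
    return to_correct, keep

# Example: one confident and one uncertain hypothesis.
hyps = [("the cat sat", [0.98, 0.97, 0.99]),
        ("recognize speech", [0.52, 0.61])]
to_correct, keep = filter_for_correction(hyps)
# Only the low-confidence transcript is selected for correction.
```

In practice the selected transcripts would then be passed to the LLM with a correction prompt, while the confident ones bypass the model entirely.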