The integration of natural language processing (NLP) technologies into educational applications has shown promising results, particularly in language learning. Recently, many spoken open-domain chatbots have been used as speaking partners to help language learners improve their language skills. However, a significant challenge is the high word error rate (WER) of automatic speech recognition (ASR) on non-native or non-fluent speech, which interrupts the flow of conversation and disappoints learners. This paper explores the use of GPT4 for ASR error correction in conversational settings. In addition to WER, we propose using semantic textual similarity (STS) and next response sensibility (NRS) metrics to evaluate the impact of error correction models on conversation quality. We find that transcriptions corrected by GPT4 lead to higher conversation quality, despite an increase in WER. GPT4 also outperforms standard error correction methods without the need for in-domain training data.
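To make the WER metric concrete, the following is a minimal sketch of the standard word-level edit-distance definition (substitutions, deletions, and insertions divided by the number of reference words). The function name `word_error_rate` and the example sentences are hypothetical illustrations, not taken from the paper, and the paper's own text normalization may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercase and split on whitespace; real evaluations
    # usually apply more careful normalization (punctuation, numerals).
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: a verbatim transcript of what a learner said,
# a raw ASR output, and a fluent rewrite of that output.
verbatim = "yesterday i go to school"
raw_asr = "yesterday i go to cool"
corrected = "yesterday i went to the school"

raw_wer = word_error_rate(verbatim, raw_asr)
corrected_wer = word_error_rate(verbatim, corrected)
print(f"raw ASR WER:   {raw_wer:.2f}")
print(f"corrected WER: {corrected_wer:.2f}")
```

In this made-up case the rewrite restores the intended word "school" yet scores a higher WER against the verbatim transcript, because it also changes other words. This is one way a correction can raise WER while remaining closer in meaning, which is the kind of effect a meaning-based metric such as STS is intended to capture.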