In a spoken dialogue system, the natural language understanding (NLU) model is preceded by a speech recognition system whose errors can degrade NLU performance. This paper proposes a method for investigating the impact of speech recognition errors on the performance of NLU models. The proposed method combines the back transcription procedure with a fine-grained technique for categorizing the errors that affect NLU performance. The method relies on synthesized speech for NLU evaluation. We show that using synthesized speech in place of audio recordings does not significantly change the outcomes of the presented technique.