Recent efforts have introduced new benchmark tasks for spoken language understanding (SLU), such as semantic parsing. In this paper, we describe our spoken semantic parsing system for the quality track (Track 1) of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. We experiment with both end-to-end and pipeline systems for this task. Strong automatic speech recognition (ASR) models such as Whisper and pretrained language models (LMs) such as BART are used within our SLU framework to boost performance. We also investigate output-level combination of various models, which yields an exact match accuracy of 80.8 and won first place in the challenge.
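For readers unfamiliar with the metric, the following is a minimal sketch of how exact match accuracy over predicted semantic parses could be computed. It is not the challenge's official scoring script: the function names, the whitespace normalization, and the example parse strings are all assumptions made for illustration.

```python
from typing import Sequence


def normalize(parse: str) -> str:
    """Collapse runs of whitespace (an assumed normalization, not the official one)."""
    return " ".join(parse.split())


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions identical to their reference parse."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * matches / len(references)


# Hypothetical parse strings, for illustration only.
references = [
    "[IN:GET_WEATHER what is the weather in [SL:LOCATION boston ] ]",
    "[IN:CREATE_ALARM set an alarm for [SL:DATE_TIME seven am ] ]",
]
predictions = [
    "[IN:GET_WEATHER what is the weather in  [SL:LOCATION boston ] ]",
    "[IN:CREATE_ALARM set an alarm for [SL:DATE_TIME seven pm ] ]",
]
score = exact_match_accuracy(predictions, references)
```

Under this definition a prediction earns credit only if the whole parse matches the reference, so a single wrong slot value (as in the second hypothetical example) counts as a miss.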