In task-oriented dialogue systems, spoken language understanding (SLU) is a critical component consisting of two sub-tasks: intent detection and slot filling. Most existing methods focus on single-intent SLU, where each utterance carries only one intent. However, in real-world scenarios users often express multiple intents in a single utterance, which poses a challenge for existing dialogue systems and datasets. In this paper, we propose a generative framework that addresses multiple intent detection and slot filling simultaneously. In particular, an attention-over-attention decoder is proposed to handle the variable number of intents and the interference between the two sub-tasks by incorporating an inductive bias into the multi-task learning process. In addition, we construct two new multi-intent SLU datasets from single-intent utterances by leveraging the next sentence prediction (NSP) head of BERT. Experimental results demonstrate that our attention-over-attention generative model achieves state-of-the-art performance on two public datasets, MixATIS and MixSNIPS, as well as on our constructed datasets.