Joint Multiple Intent Detection and Slot Filling with Supervised Contrastive Learning and Self-Distillation

Multiple intent detection and slot filling are two fundamental tasks in spoken language understanding. Because the two tasks are closely related, joint models that detect intents and extract slots simultaneously are preferred to individual models that perform each task independently. The accuracy of a joint model depends heavily on its ability to transfer information between the two tasks so that the result of one task can correct the result of the other. In addition, since a joint model has multiple outputs, training it effectively is also challenging. In this paper, we present a method for multiple intent detection and slot filling that addresses these challenges. First, we propose a bidirectional joint model that explicitly employs intent information to recognize slots and slot features to detect intents. Second, we introduce a novel method for training the proposed joint model using supervised contrastive learning and self-distillation. Experimental results on two benchmark datasets, MixATIS and MixSNIPS, show that our method outperforms state-of-the-art models in both tasks. The results also demonstrate the contributions of both the bidirectional design and the training method to the accuracy improvement. Our source code is available at https://github.com/anhtunguyen98/BiSLU.
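To give a rough sense of what supervised contrastive learning optimizes, the sketch below computes a generic supervised contrastive loss over a small batch of embeddings. It is an illustration under stated assumptions, not the formulation used in the paper: the function names `l2_normalize` and `supcon_loss`, the temperature of 0.1, and the single-label setup are all hypothetical choices made here (the paper's task is multi-intent, so its actual definition of positive pairs may well differ). The idea shown is only that embeddings sharing a label are pulled together and the rest are pushed apart.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, single-label).

    For each anchor i, positives are the other samples with the same label.
    The loss is the mean negative log-probability of the positives under a
    softmax over similarities to all other samples in the batch.
    Anchors without any positive are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in others
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    # Labels that agree with the geometry give a low loss ...
    print(supcon_loss(points, [0, 0, 1, 1]))
    # ... while labels that cut across the clusters give a high loss.
    print(supcon_loss(points, [0, 1, 0, 1]))
```

In a joint model, a term of this kind would typically be added to the usual intent and slot classification losses; how the terms are weighted, and how self-distillation is combined with them, is not specified in the abstract.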
