Intent recognition is critical for task-oriented dialogue systems. For emerging domains and new services, however, accurately identifying the key intents of a conversation is difficult because data annotation is time-consuming and models transfer comparatively poorly. Automatic induction of dialogue intents is therefore very important for intelligent dialogue systems. This paper presents our solution to Track 2, Intent Induction from Conversations for Task-Oriented Dialogue, of the Eleventh Dialogue System Technology Challenge (DSTC11). The essence of intent clustering lies in distinguishing the representations of different dialogue utterances. The key to automatic intent induction is that, for any given set of new data, the sentence representations produced by the model separate well across different labels. We therefore propose a multi-stage, coarse-to-fine contrastive learning training scheme comprising unsupervised contrastive learning pre-training, supervised contrastive learning pre-training, and fine-tuning with joint contrastive learning and clustering, to obtain a better dialogue utterance representation model for the clustering task. In the released DSTC11 Track 2 evaluation results, our proposed system ranked first on both subtasks of the track.
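To make the first two training stages concrete, the following is a minimal sketch of the two kinds of contrastive objective they name. The abstract does not specify the exact losses, so this assumes an InfoNCE-style loss for the unsupervised stage and a SupCon-style loss for the supervised stage. The function names, the temperature value, and the use of plain NumPy arrays in place of encoder outputs are all illustrative assumptions, not the authors' implementation. The third stage (joint contrastive learning and clustering) is not sketched here.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def info_nce_loss(view_a, view_b, temperature=0.05):
    """Assumed unsupervised objective: row i of view_a should match row i of
    view_b (two views of the same utterance); other rows act as negatives."""
    a = l2_normalize(np.asarray(view_a, dtype=float))
    b = l2_normalize(np.asarray(view_b, dtype=float))
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def supervised_contrastive_loss(embeddings, labels, temperature=0.05):
    """Assumed supervised objective: utterances sharing an intent label are
    positives for each other; all other utterances in the batch are negatives.
    Expects at least two rows and at least one label that occurs twice."""
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    labels = np.asarray(labels)
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Mask self-similarity so an anchor never counts as its own candidate.
    logits = np.where(not_self, z @ z.T / temperature, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0  # skip anchors with no same-label partner
    summed = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(np.mean(-summed[has_pos] / pos_counts[has_pos]))


# Toy embeddings: two tight groups standing in for two intents.
toy = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss_matching_labels = supervised_contrastive_loss(toy, [0, 0, 1, 1])
loss_mismatched_labels = supervised_contrastive_loss(toy, [0, 1, 0, 1])
```

On the toy data, the loss is small when the labels agree with the geometric grouping and large when they cut across it, which is the property that makes such representations easier to cluster.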