Pre-trained language models have been successful in many scenarios. However, their usefulness in task-oriented dialogues is limited by the intrinsic linguistic differences between general text and task-oriented dialogues. Current task-oriented dialogue (TOD) pre-training methods rely on a contrastive framework, which struggles to select true positives and hard negatives and lacks diversity. In this paper, we propose BootTOD, a novel dialogue pre-training model that learns task-oriented dialogue representations via a self-bootstrapping framework. Unlike its contrastive counterparts, BootTOD aligns context and context+response representations and removes the need for contrastive pairs. BootTOD also uses multiple appropriate response targets to model the intrinsic one-to-many diversity of human conversations. Experimental results show that BootTOD outperforms strong TOD baselines on diverse downstream dialogue tasks.
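The alignment idea described above can be illustrated with a small sketch. The abstract does not specify the training objective, so everything here is an assumption made for illustration: the function names `l2_normalize` and `alignment_loss` are hypothetical, the vectors stand in for encoder outputs, and the squared distance between unit-normalized vectors is a common choice in bootstrapping-style methods rather than the loss confirmed for BootTOD. The sketch only shows how a single context representation could be pulled toward several context+response targets without any negative pairs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def alignment_loss(context_vec, target_vecs):
    """Average squared distance between a context representation and
    several context+response target representations.

    Assumption: both sides are unit-normalized first, so each term equals
    2 - 2 * cosine_similarity and lies in [0, 4]. No negatives are used.
    """
    if not target_vecs:
        raise ValueError("at least one target is required")
    c = l2_normalize(context_vec)
    total = 0.0
    for target in target_vecs:
        t = l2_normalize(target)
        total += sum((a - b) ** 2 for a, b in zip(c, t))
    # Averaging over targets is one simple way to use multiple responses.
    return total / len(target_vecs)


if __name__ == "__main__":
    context = [1.0, 0.0]
    # Two hypothetical targets, one per plausible response to the context.
    targets = [[2.0, 0.0], [0.0, 1.0]]
    print(alignment_loss(context, targets))
```

In a real training setup the targets would typically come from an encoder whose gradients are stopped, and the context vector would pass through a predictor head; both details are assumptions here and are omitted to keep the sketch self-contained.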