With the growing need to deploy dialogue systems in new domains at lower cost, zero-shot dialogue state tracking (DST), which tracks users' requirements in task-oriented dialogues without training on the target domains, is drawing increasing attention. Although prior work has leveraged question-answering (QA) data to reduce the need for in-domain training in DST, it fails to explicitly model knowledge transfer and fusion for tracking dialogue states. To address this issue, we propose CoFunDST, which is built on a T5 pre-trained language model, trained on domain-agnostic QA datasets, and directly uses candidate choices of slot values as knowledge for zero-shot dialogue state generation. Specifically, CoFunDST selects the choices that are highly relevant to the reference context and fuses them to initialize the decoder, thereby constraining the model outputs. Experimental results show that our model achieves higher joint goal accuracy than existing zero-shot DST approaches in most domains of MultiWOZ 2.1. Extensive analyses demonstrate the effectiveness of our approach for improving zero-shot DST by learning from QA.
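For readers unfamiliar with the metric, joint goal accuracy is commonly computed as the fraction of dialogue turns whose predicted state matches the gold state on every slot. The sketch below illustrates only that common definition; the function name and the slot names are hypothetical, and it is not the evaluation code used for CoFunDST.

```python
from typing import Dict, List

# A dialogue state maps slot names to values, e.g. {"hotel-area": "north"}.
DialogueState = Dict[str, str]


def joint_goal_accuracy(
    predictions: List[DialogueState],
    references: List[DialogueState],
) -> float:
    """Fraction of turns whose predicted state equals the gold state exactly.

    Assumption: both lists are aligned turn by turn, and a turn counts as
    correct only if every slot-value pair matches (no partial credit).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Dict equality checks that both the slot sets and all values agree.
    correct = sum(pred == gold for pred, gold in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical slot names, for illustration only.
    gold = [
        {"hotel-area": "north"},
        {"hotel-area": "north", "hotel-stars": "4"},
    ]
    pred = [
        {"hotel-area": "north"},
        {"hotel-area": "north", "hotel-stars": "3"},
    ]
    print(joint_goal_accuracy(pred, gold))
```

Because a single wrong slot makes the whole turn incorrect, this metric is considerably stricter than per-slot accuracy.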