Leveraging large pretrained language models, many research works have integrated knowledge into dialogue systems. Traditional techniques treat knowledge as part of the input sequence, prepending a set of knowledge statements to the dialogue history. However, this mechanism forces the knowledge statements to be concatenated in a fixed order, so models implicitly learn to pay imbalanced attention to them during training. In this paper, we first investigate how the order of the knowledge set influences the responses of autoregressive dialogue systems. We conduct experiments on two commonly used dialogue datasets with two types of transformer-based models and find that the models treat the input knowledge statements unequally. To address this, we propose a simple and novel technique that alleviates the order effect by modifying the position embeddings of the knowledge input in these models. With the proposed position embedding method, the experimental results show that each knowledge statement is considered uniformly when generating responses.
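The abstract does not specify how the position embeddings are modified. As an illustration only, the following minimal sketch assumes one plausible scheme: position indices restart at zero for every knowledge statement, and the dialogue history begins after the longest statement. The function name `build_position_ids` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List


def build_position_ids(knowledge_lengths: List[int], history_length: int) -> List[int]:
    """Hypothetical scheme (an assumption, not the paper's method):
    restart position indices at 0 for every knowledge statement."""
    position_ids: List[int] = []
    for length in knowledge_lengths:
        # Each statement occupies positions 0..length-1, regardless of
        # where it sits in the concatenated input sequence.
        position_ids.extend(range(length))
    # The dialogue history starts after the longest statement, so its
    # positions do not depend on the order of the knowledge statements.
    offset = max(knowledge_lengths, default=0)
    position_ids.extend(range(offset, offset + history_length))
    return position_ids


if __name__ == "__main__":
    # Two knowledge statements of 2 and 3 tokens, then a 2-token history.
    print(build_position_ids([2, 3], history_length=2))
```

Under this assumed scheme, swapping two knowledge statements changes only where their tokens appear in the sequence, not the position indices each statement receives. Such indices could, for example, be supplied to a model whose forward pass accepts explicit position indices in place of the default 0, 1, 2, ... ordering.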