Long-range context modeling is crucial to both dialogue understanding and generation. The most popular method for dialogue context representation is to concatenate the last $k$ utterances in chronological order. However, this method may not be ideal for conversations containing long-range dependencies, i.e., when a model must look beyond the last $k$ utterances to generate a meaningful response. In this work, we propose DialoGen, a novel encoder-decoder-based framework for dialogue generation with a generalized context representation that can look beyond the last $k$ utterances. The main idea of our approach is to identify and use the most relevant historical utterances instead of the last $k$, which also enables a compact representation of the dialogue history with fewer tokens. We study the effectiveness of the proposed method on both dialogue generation (open-domain) and dialogue understanding (dialogue state tracking, DST). Even with a compact context representation, DialoGen performs comparably to state-of-the-art models on the open-domain DailyDialog dataset. We observe similar behavior on the DST task of the MultiWOZ dataset when the proposed context representation is applied to existing DST models. We also discuss the generalizability and interpretability of DialoGen and show that the relevance scores of previous utterances agree well with human cognition.
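To make the contrast between the two context representations concrete, the following is a minimal sketch of last-$k$ selection versus relevance-based selection. It is not DialoGen's implementation: the abstract does not specify how relevance is computed, so the scorer here (`jaccard_overlap`, a simple word-overlap measure) is a placeholder assumption, and the function names `last_k_context` and `relevant_context` are hypothetical.

```python
from typing import Callable, List


def jaccard_overlap(a: str, b: str) -> float:
    """Placeholder relevance score (an assumption, not the paper's scorer):
    Jaccard overlap between the lowercase word sets of two utterances."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def last_k_context(history: List[str], k: int) -> List[str]:
    """Baseline: the last k utterances, in chronological order."""
    return history[-k:] if k > 0 else []


def relevant_context(
    history: List[str],
    k: int,
    score: Callable[[str, str], float] = jaccard_overlap,
) -> List[str]:
    """Keep the latest utterance plus the k-1 earlier utterances that score
    highest against it, returned in chronological order."""
    if k <= 0 or not history:
        return []
    query, candidates = history[-1], history[:-1]
    # Rank earlier utterances by score; ties go to the more recent utterance.
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: (score(candidates[i], query), i),
        reverse=True,
    )
    picked = sorted(ranked[: k - 1])  # restore chronological order
    return [candidates[i] for i in picked] + [query]


history = [
    "I am allergic to peanuts so please keep that in mind",
    "Sure noted",
    "What is the weather like today",
    "It is sunny and warm",
    "Great",
    "Can you suggest a dessert without peanuts",
]

print(last_k_context(history, 3))    # drops the allergy statement
print(relevant_context(history, 3))  # keeps the allergy statement
```

With the same budget of three utterances, the last-$k$ window loses the early statement that the final request depends on, while the relevance-based selection retains it. A learned relevance model would replace the word-overlap scorer in any realistic setting.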