The ability to generate diverse, faithful responses grounded in factual knowledge is paramount for building a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge selection and response generation separately and may therefore overlook the inherent correlation between the two tasks, or leverage a conditional variational method that jointly optimizes knowledge selection and response generation by employing an inference network. In this paper, we present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), that selects knowledge and generates dialogue responses by approximately sampling from the posterior distribution. Unlike other methods, SPI does not require an inference network or assume a simple geometry of the posterior distribution. Its straightforward and intuitive inference procedure directly queries the response generation model, allowing for accurate knowledge selection and generation of faithful responses. In addition to these modeling contributions, our experimental results on two common dialogue datasets (Wizard of Wikipedia and Holl-E) demonstrate that SPI outperforms previous strong baselines according to both automatic and human evaluation metrics.
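To make the idea of selecting knowledge by querying the response generation model more concrete, the following is a minimal sketch rather than the SPI algorithm itself. It assumes a finite set of knowledge candidates, a prior score for each candidate given the dialogue context, and a log-likelihood of the observed response under the generator for each candidate; the posterior then follows from Bayes' rule without any inference network. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math
import random


def knowledge_posterior(log_prior, log_likelihood):
    """Posterior over knowledge candidates via Bayes' rule.

    log_prior[i]      : log p(k_i | context)            (assumed given)
    log_likelihood[i] : log p(response | context, k_i)  (assumed to come from
                        scoring the response with the generation model)
    """
    if len(log_prior) != len(log_likelihood):
        raise ValueError("one prior and one likelihood per candidate required")
    joint = [lp + ll for lp, ll in zip(log_prior, log_likelihood)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(joint)
    weights = [math.exp(j - peak) for j in joint]
    total = sum(weights)
    return [w / total for w in weights]


def sample_knowledge(posterior, rng):
    """Draw the index of one knowledge candidate from the posterior."""
    return rng.choices(range(len(posterior)), weights=posterior, k=1)[0]


if __name__ == "__main__":
    # Hypothetical toy values: three candidates with a uniform prior.
    log_prior = [math.log(1.0 / 3.0)] * 3
    log_likelihood = [-12.0, -4.0, -9.0]
    posterior = knowledge_posterior(log_prior, log_likelihood)
    chosen = sample_knowledge(posterior, random.Random(0))
    print("posterior:", posterior)
    print("sampled candidate index:", chosen)
```

In this sketch the candidate under which the generator assigns the response the highest likelihood receives most of the posterior mass, which illustrates why no separate inference network is needed when the candidate set can be scored directly. The paper describes its procedure only as approximate sampling from the posterior, so the exact enumeration used here is an assumption that applies only to small candidate sets.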