Building open-domain dialogue models with deep neural networks has recently become an active research topic. However, the responses these models generate suffer from several problems: they are often poorly grounded in the dialogue context, and they tend to be generic and uninformative, which seriously damages the user experience. Many studies therefore introduce additional information into dialogue models to make the generated responses more vivid and informative. In contrast, this paper improves response quality by learning the implicit pattern information between contexts and responses in the training samples. We first build an open-domain dialogue model on top of a pre-trained language model (GPT-2). We then propose an improved scheduled sampling method for pre-trained models, which lets the ground-truth responses guide response generation during training while avoiding the exposure bias problem. More importantly, we design a response-aware mechanism that mines the implicit pattern information between contexts and responses, so that the generated replies are more diverse and closer to human replies. Finally, we evaluate the proposed model (RAD) on the Persona-Chat and DailyDialog datasets; the experimental results show that it outperforms the baselines on most automatic and manual metrics.
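For readers unfamiliar with scheduled sampling, the following is a minimal sketch of the standard (vanilla) idea only, not the improved variant proposed in this paper, whose details the abstract does not give. During training, each decoder input is taken from the ground-truth response with some probability and from the model's own prediction otherwise, and that probability is decayed over time so the model gradually sees its own outputs. The function names, the linear decay schedule, and the `floor` parameter are all assumptions made for illustration; `predicted_tokens` stands in for tokens a model would have produced in an earlier pass.

```python
import random


def gold_token_prob(step, total_steps, floor=0.0):
    """Linearly decay the probability of feeding the gold token.

    Hypothetical schedule: starts at 1.0 (pure teacher forcing) and
    reaches `floor` at `total_steps`. Other decay shapes are possible.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - (1.0 - floor) * frac


def mix_decoder_inputs(gold_tokens, predicted_tokens, gold_prob, rng):
    """Build decoder inputs for one training example.

    At each position, keep the gold token with probability `gold_prob`;
    otherwise substitute the model's own predicted token.
    """
    if len(gold_tokens) != len(predicted_tokens):
        raise ValueError("gold and predicted sequences must align")
    return [
        gold if rng.random() < gold_prob else pred
        for gold, pred in zip(gold_tokens, predicted_tokens)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    gold = ["i", "love", "hiking", "on", "weekends"]
    pred = ["i", "like", "walking", "on", "sundays"]
    for step in (0, 500, 1000):
        p = gold_token_prob(step, total_steps=1000, floor=0.25)
        print(step, round(p, 3), mix_decoder_inputs(gold, pred, p, rng))
```

Early in training the mixed sequence equals the ground-truth response; as `gold_prob` falls, more positions come from the model's own predictions, which is how scheduled sampling narrows the gap between training and inference that causes exposure bias.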