Recently, research on open-domain dialogue systems has attracted extensive interest from both academic and industrial researchers. The goal of an open-domain dialogue system is to imitate humans in conversation. Previous work on single-turn conversation generation has greatly advanced research on open-domain dialogue systems. However, understanding multiple single-turn conversations is not equivalent to understanding a multi-turn dialogue, because human dialogue is coherent and context-dependent. Therefore, in open-domain multi-turn dialogue generation, it is essential to model the contextual semantics of the whole dialogue history, rather than relying only on the last utterance. Previous research has verified the effectiveness of the hierarchical recurrent encoder-decoder framework for open-domain multi-turn dialogue generation. However, using RNN-based models to hierarchically encode the utterances into a representation of the dialogue history still suffers from the vanishing gradient problem. To address this issue, in this paper we propose a static and dynamic attention-based approach that models the dialogue history and then generates open-domain multi-turn dialogue responses. Experimental results on the Ubuntu and OpenSubtitles datasets verify the effectiveness of the proposed static and dynamic attention-based approach on both automatic and human evaluation metrics under various experimental settings. Meanwhile, we also empirically verify the performance of combining static and dynamic attention for open-domain multi-turn dialogue generation.
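To make the distinction concrete, the following is a minimal sketch (not the paper's actual model) of how a static attention, computed once over the encoded utterance representations, differs from a dynamic attention, recomputed at every decoding step from the current decoder state. All function and variable names here are illustrative assumptions; the real model would use learned parameters rather than raw dot products.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def static_attention(utterance_reprs, global_query):
    # Static: one fixed context vector over all utterance-level
    # representations, computed once before decoding starts.
    scores = utterance_reprs @ global_query      # (num_utterances,)
    weights = softmax(scores)
    return weights @ utterance_reprs             # (hidden_dim,)

def dynamic_attention(utterance_reprs, decoder_state):
    # Dynamic: the same weighted sum, but conditioned on the
    # decoder hidden state, so it changes at every decoding step.
    scores = utterance_reprs @ decoder_state
    weights = softmax(scores)
    return weights @ utterance_reprs

# Illustrative usage: 5 utterances in the history, hidden size 8.
rng = np.random.default_rng(0)
history = rng.standard_normal((5, 8))
static_ctx = static_attention(history, rng.standard_normal(8))
dynamic_ctx = dynamic_attention(history, rng.standard_normal(8))
# A combined approach could, e.g., concatenate the two context
# vectors before feeding them to the decoder.
combined = np.concatenate([static_ctx, dynamic_ctx])
```

The key design point is that the static context summarizes the dialogue history as a whole, while the dynamic context lets each generated token attend to different utterances.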