Implicit knowledge, such as common sense, is key to fluid human conversations. Current neural response generation (RG) models are trained to generate responses directly, omitting unstated implicit knowledge. In this paper, we present Think-Before-Speaking (TBS), a generative approach to first externalize implicit commonsense knowledge (think) and use this knowledge to generate responses (speak). We expect that externalizing implicit knowledge allows more efficient learning, produces more informative responses, and enables more explainable models. We analyze different choices to collect knowledge-aligned dialogues, represent implicit knowledge, and transition between knowledge and dialogues. Empirical results show TBS models outperform end-to-end and knowledge-augmented RG baselines on most automatic metrics and generate more informative, specific, and commonsense-following responses, as evaluated by human annotators. TBS also generates knowledge that makes sense and is relevant to the dialogue around 85% of the time.