Empathy is a crucial factor in open-domain conversations, as it naturally conveys care for and understanding of others. Although several methods have been proposed to generate empathetic responses, existing work often yields monotonous empathy, that is, generic and safe expressions. In this paper, we propose using explicit control to guide empathy expression and design a framework, DiffusEmp, based on a conditional diffusion language model, which unifies the use of dialogue context and attribute-oriented control signals. Specifically, communication mechanism, intent, and semantic frame are introduced as multi-grained signals that control the realization of empathy from coarse to fine levels. We then design a masking strategy that reflects the relationship between the multi-grained signals and the response tokens, and integrate it into the diffusion model to influence the generative process. Experimental results on the benchmark dataset EmpatheticDialogue show that our framework outperforms competitive baselines in controllability, informativeness, and diversity without sacrificing context-relatedness.
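The abstract does not spell out the masking strategy, so the following is only a minimal sketch of one way a mask could relate multi-grained signals to response tokens. Everything in it is an assumption for illustration: the sequence layout, the hypothetical function name `build_control_mask`, the rule that coarse signals (communication mechanism, intent) are visible to every position, and the one-to-one alignment between frame signals and response tokens. It is not the DiffusEmp implementation.

```python
from typing import List


def build_control_mask(n_context: int, n_coarse: int, n_response: int) -> List[List[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed layout: [context | coarse signals | frame signals | response],
    with exactly one frame signal per response token (hypothetical alignment).
    """
    frame_start = n_context + n_coarse
    resp_start = frame_start + n_response
    total = resp_start + n_response

    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            j_is_frame = frame_start <= j < resp_start
            i_is_resp = i >= resp_start
            if not j_is_frame:
                # Assumption: context, coarse signals, and response tokens
                # are visible to every position.
                mask[i][j] = True
            elif i == j:
                # A frame signal can attend to itself.
                mask[i][j] = True
            elif i_is_resp and (i - resp_start) == (j - frame_start):
                # Fine-grained control: response token k sees only frame k.
                mask[i][j] = True
    return mask


if __name__ == "__main__":
    mask = build_control_mask(n_context=2, n_coarse=2, n_response=3)
    for row in mask:
        print("".join("1" if allowed else "." for allowed in row))
```

In this sketch the coarse signals influence the whole response, while each frame signal constrains a single token, which mirrors the coarse-to-fine idea described above. A mask of this shape would typically be passed to the attention layers of the denoising network.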