Empathetic dialogue requires not only recognizing a user's emotional state but also making strategy-aware, context-sensitive decisions throughout response generation. However, the lack of a comprehensive empathy strategy framework, explicit task-aligned multi-stage reasoning, and high-quality strategy-aware data fundamentally limits existing approaches, preventing them from effectively modeling empathetic dialogue as a complex, multi-stage cognitive and decision-making process. To address these challenges, we propose STRIDE-ED, a STRategy-grounded, Interpretable, and DEep reasoning framework that models Empathetic Dialogue through structured, strategy-conditioned reasoning. To support effective learning, we develop a strategy-aware data refinement pipeline integrating LLM-based annotation, multi-model consistency-weighted evaluation, and dynamic sampling to construct high-quality training data aligned with empathetic strategies. Furthermore, we adopt a two-stage training paradigm that combines supervised fine-tuning with multi-objective reinforcement learning to better align model behaviors with target emotions, empathetic strategies, and response formats. Extensive experiments demonstrate that STRIDE-ED generalizes across diverse open-source LLMs and consistently outperforms existing methods on both automatic metrics and human evaluations.