Large language models (LLMs) have been widely applied to emotional support conversation (ESC). However, complex multi-turn support remains challenging, because existing alignment schemes rely on sparse outcome-level signals and thus offer limited supervision for intermediate strategy decisions. To fill this gap, this paper proposes the affective flow language model for emotional support conversation (AFlow), a framework that introduces fine-grained supervision on dialogue prefixes by modeling a continuous affective flow along multi-turn trajectories. AFlow estimates intermediate utility over searched trajectories and learns preference-consistent strategy transitions. To improve strategy coherence and the quality of empathetic responses, a subpath-level flow-balance objective is introduced to propagate preference signals to intermediate states. Experimental results show consistent and significant improvements over competitive baselines across diverse emotional contexts. Remarkably, AFlow with a compact open-source backbone outperforms proprietary LLMs such as GPT-4o and Claude-3.5 on major ESC metrics. Our code is available at https://github.com/chzou25-lgtm/AffectiveFlow.
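The abstract does not spell out AFlow's exact objective, but a subpath-level flow-balance loss is typically structured like the subtrajectory-balance objective from the GFlowNet literature: for a dialogue subpath from state s_m to state s_n, the log-flow into the subpath plus the forward transition log-probabilities should match the log-flow out plus the backward log-probabilities. The sketch below is a hypothetical illustration of that generic idea (the function name, inputs, and indexing are assumptions, not AFlow's actual implementation):

```python
def subpath_flow_balance_loss(log_F, log_PF, log_PB, m, n):
    """Generic subtrajectory-balance-style loss over a subpath s_m -> s_n.

    log_F  : list of log state-flow estimates, one per dialogue state.
    log_PF : list of forward (strategy transition) log-probabilities,
             log_PF[t] is the log-prob of the step s_t -> s_{t+1}.
    log_PB : list of backward log-probabilities for the same steps.
    m, n   : endpoints of the subpath (0 <= m < n < len(log_F)).

    Returns the squared mismatch between the two sides of the
    flow-balance condition; zero means the subpath is balanced.
    """
    lhs = log_F[m] + sum(log_PF[m:n])  # flow in + forward transitions
    rhs = log_F[n] + sum(log_PB[m:n])  # flow out + backward transitions
    return (lhs - rhs) ** 2


# A perfectly balanced toy subpath: flows drop by exactly the
# forward log-probabilities, and the backward policy is deterministic.
log_F = [0.0, -1.0, -2.0]
log_PF = [-1.0, -1.0]
log_PB = [0.0, 0.0]
loss = subpath_flow_balance_loss(log_F, log_PF, log_PB, 0, 2)
```

Applying such a loss over many sampled subpaths (rather than only full trajectories) is what propagates preference signals to intermediate dialogue states.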