Summarization refinement faces challenges when extended to multiple dimensions. In this paper, we introduce ReFeed, a powerful summarization refinement pipeline that enhances multiple dimensions through reflective reasoning on feedback. To achieve this, we release SumFeed-CoT, a large-scale Long-CoT-based dataset optimized for training a lightweight model with reflective reasoning. Our experiments reveal how the number of dimensions, feedback exposure, and reasoning policy influence refinement performance, highlighting that reflective reasoning and simultaneously addressing multiple feedback are crucial to mitigating the trade-off between dimensions. Furthermore, ReFeed is robust to noisy feedback and to feedback order. Lastly, our findings emphasize that creating data with a proper goal and guideline constitutes a fundamental pillar of effective reasoning. The dataset and model will be released.