Modern abstractive summarization models often generate summaries that contain hallucinated or contradictory information. In this paper, we propose a simple but effective contrastive learning framework that incorporates recent developments in reward learning and factuality metrics. Empirical studies demonstrate that the proposed framework enables summarization models to learn from the feedback of factuality metrics through contrastive reward learning, yielding summaries that human evaluators judge to be more factual. This suggests that further advances in learning and evaluation algorithms can translate directly into more factual summaries.
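To make the idea of learning from factuality-metric feedback concrete, the following is a minimal sketch of one common contrastive objective: a pairwise margin ranking loss over candidate summaries. The abstract does not specify the loss, so the function name `contrastive_ranking_loss`, the rank-scaled margin, and the example scores are all illustrative assumptions rather than the authors' implementation. The intuition is that candidates ranked higher by a factuality metric should also receive higher scores (for example, length-normalized log-likelihoods) from the summarization model.

```python
from typing import List


def contrastive_ranking_loss(
    model_scores: List[float],
    metric_scores: List[float],
    margin: float = 0.01,
) -> float:
    """Pairwise margin ranking loss over candidate summaries (hypothetical sketch).

    model_scores:  score the summarizer assigns to each candidate
                   (assumed here to be a length-normalized log-likelihood).
    metric_scores: factuality-metric score for each candidate (higher is better).
    margin:        base margin, scaled by the rank gap between two candidates.
    """
    if len(model_scores) != len(metric_scores):
        raise ValueError("model_scores and metric_scores must have equal length")

    # Order candidates from most to least factual according to the metric.
    order = sorted(
        range(len(metric_scores)), key=lambda i: metric_scores[i], reverse=True
    )
    ranked = [model_scores[i] for i in order]

    loss = 0.0
    for i in range(len(ranked)):
        for j in range(i + 1, len(ranked)):
            # Candidate i is ranked above candidate j by the metric, so the model
            # is penalized unless it scores i above j by at least the scaled margin.
            loss += max(0.0, ranked[j] - ranked[i] + margin * (j - i))
    return loss


# Model ordering agrees with the metric ordering: no penalty.
agree = contrastive_ranking_loss([-0.5, -1.0, -2.0], [0.9, 0.6, 0.2])

# Model ordering is the reverse of the metric ordering: positive penalty.
disagree = contrastive_ranking_loss([-2.0, -1.0, -0.5], [0.9, 0.6, 0.2])
```

In a real training setup this term would be computed on differentiable model scores and typically combined with the usual cross-entropy loss; the pure-Python version here only illustrates how metric rankings become a training signal.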