Recently, various neural encoder-decoder models, pioneered by the Seq2Seq framework, have been proposed to generate more abstractive summaries by learning to map input text to output text. At a high level, such neural models can generate summaries freely, without any constraint on the words or phrases used. Moreover, their format is closer to that of human-edited summaries, and their output is more readable and fluent. However, the neural models' abstraction ability is a double-edged sword. A commonly observed problem with the generated summaries is the distortion or fabrication of factual information from the article. This inconsistency between the original text and the summary has raised concerns about the applicability of these models, and previous evaluation methods for text summarization are not suited to detecting it. In response, current research falls predominantly into two categories: one designs fact-aware evaluation metrics to select outputs without factual inconsistency errors, and the other develops new summarization systems oriented toward factual consistency. In this survey, we present a comprehensive review of these fact-specific evaluation methods and text summarization models.