Entity abstract summarization aims to generate a coherent description of a given entity based on a set of relevant Internet documents. Pretrained language models (PLMs) have achieved significant success on this task, but they may suffer from hallucination, i.e., generating non-factual information about the entity. To address this issue, we decompose the summary into two components: Facts, the entity-specific factual information that PLMs are prone to fabricate; and Template, the generic content with designated slots for facts, which PLMs can generate competently. Based on this facts-template decomposition, we propose SlotSum, an explainable framework for entity abstract summarization. SlotSum first creates the template and then predicts the fact for each template slot from the input documents. Benefiting from the facts-template decomposition, SlotSum can easily locate errors and further rectify hallucinated predictions with external knowledge. We construct a new dataset, WikiFactSum, to evaluate the performance of SlotSum. Experimental results demonstrate that, given credible external knowledge, SlotSum generates summaries that are significantly more factual.
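The facts-template decomposition can be pictured as filling named slots in a generic template, with external knowledge overriding hallucinated slot predictions. The following is a minimal illustrative sketch; the slot names, example entity, and the `fill` helper are hypothetical assumptions, not the paper's actual interface.

```python
# Hypothetical sketch of facts-template decomposition: a template with
# designated slots, slot values predicted from documents, and rectification
# of predictions that disagree with credible external knowledge.

TEMPLATE = "[NAME] is a [OCCUPATION] born in [BIRTH_YEAR] in [BIRTH_PLACE]."

# Slot values predicted from the input documents (one may be hallucinated).
predicted_facts = {
    "NAME": "Marie Curie",
    "OCCUPATION": "physicist",
    "BIRTH_YEAR": "1868",
    "BIRTH_PLACE": "Warsaw",
}

# Credible external knowledge (e.g., a knowledge base) used for rectification.
external_knowledge = {
    "BIRTH_YEAR": "1867",
}

def fill(template: str, facts: dict, knowledge: dict) -> str:
    summary = template
    for slot, value in facts.items():
        # Prefer the external value when it exists, rectifying hallucinations.
        summary = summary.replace(f"[{slot}]", knowledge.get(slot, value))
    return summary

print(fill(TEMPLATE, predicted_facts, external_knowledge))
# → Marie Curie is a physicist born in 1867 in Warsaw.
```

Because errors are localized to individual slots, a wrong prediction can be traced and corrected without regenerating the whole summary.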