Multi-document summarization entails producing concise synopses of collections of inputs. For some applications, the synopsis should accurately synthesize inputs with respect to a key aspect, e.g., a synopsis of film reviews written about a particular movie should reflect the average critic consensus. As a more consequential example, narrative summaries that accompany biomedical systematic reviews of clinical trial results should accurately summarize the potentially conflicting results from individual trials. In this paper we ask: To what extent do modern multi-document summarization models implicitly perform this sort of synthesis? We run experiments over opinion and evidence synthesis datasets using a suite of summarization models, from fine-tuned transformers to GPT-4. We find that existing models partially perform synthesis, but imperfectly: even the best-performing models are over-sensitive to changes in input ordering and under-sensitive to changes in input composition (e.g., the ratio of positive to negative reviews). We propose a simple, general, and effective method for improving model synthesis capabilities: generating an explicitly diverse set of candidate outputs, then selecting the candidate best aligned with the expected aggregate measure for the inputs, or abstaining when the model produces no good candidate.
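The candidate-selection-with-abstention step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `estimate_polarity`, `select_summary`, and the `TOLERANCE` threshold are all hypothetical stand-ins, and a real system would use a trained classifier to estimate each candidate's implied aggregate rather than a word-count heuristic.

```python
# Hedged sketch: pick, from a diverse candidate pool, the summary whose
# estimated aggregate (here, sentiment polarity) best matches the expected
# aggregate of the inputs; abstain if no candidate is close enough.
# All names below are illustrative assumptions, not the authors' code.

from typing import Optional

TOLERANCE = 0.25  # abstain if no candidate's polarity is this close to the target


def estimate_polarity(text: str) -> float:
    """Toy aggregate estimator: fraction of positive cue words among
    sentiment cue words. A real system would use a trained classifier."""
    pos = {"great", "good", "excellent", "positive"}
    neg = {"bad", "poor", "terrible", "negative"}
    words = [w.strip(".,:;!?") for w in text.lower().split()]
    p = sum(w in pos for w in words)
    n = sum(w in neg for w in words)
    return 0.5 if p + n == 0 else p / (p + n)


def select_summary(candidates: list[str], target: float) -> Optional[str]:
    """Return the candidate whose estimated polarity is nearest to the
    expected aggregate `target`; return None (abstain) if none is close."""
    best = min(candidates, key=lambda c: abs(estimate_polarity(c) - target))
    if abs(estimate_polarity(best) - target) > TOLERANCE:
        return None  # abstain: no good candidate
    return best


candidates = [
    "Critics found the film great and the acting excellent.",
    "Reviews were mixed: some good moments but a poor script.",
    "A terrible, bad film according to most reviews.",
]
# Suppose 70% of the input reviews were positive: the mixed summary is closest.
print(select_summary(candidates, target=0.7))
```

The abstention branch is what distinguishes this from plain reranking: when even the nearest candidate's implied aggregate falls outside the tolerance, the method declines to output a summary rather than emit one that misrepresents the input composition.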