Pre-trained language models (PLMs) have achieved impressive results in abstractive single-document summarization (SDS). However, these gains may not readily extend to multi-document summarization (MDS), where the interactions among documents are more complex. Previous works either design new architectures or new pre-training objectives for MDS, or apply PLMs to MDS without considering the complex document interactions. The former do not make full use of previous pre-training efforts and may not generalize well across domains, while the latter cannot fully attend to the intricate relationships unique to MDS tasks. In this paper, we enforce hierarchy on both the encoder and decoder to make better use of a PLM in facilitating multi-document interactions for the MDS task. We test our design on 10 MDS datasets across a wide range of domains. Extensive experiments show that our proposed method achieves consistent improvements on all these datasets, outperforming the previous best models, and even achieves better or competitive results compared with some models that use additional MDS pre-training or more parameters.