Curriculum-Guided Abstractive Summarization

Recent Transformer-based models offer a promising approach to abstractive summarization. They go beyond sentence selection and other extractive strategies to handle harder tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have two shortcomings: (1) they often perform poorly at content selection, and (2) their training strategy is inefficient, which limits model performance. In this paper, we explore two orthogonal ways to address these shortcomings. First, we augment the Transformer network with a sentence cross-attention module in the decoder, encouraging more abstraction of salient content. Second, we add a curriculum learning approach that reweights the training samples, making the learning procedure more efficient. Of the two, the curriculum-based training strategy yields larger gains than the sentence cross-attention module. We apply our model to the extreme summarization dataset of Reddit TIFU posts. We further examine three cross-domain summarization datasets (Webis-TLDR-17, CNN/DM, and XSum) to measure the efficacy of curriculum learning when applied to summarization. Finally, we conduct a human evaluation to assess the proposed method on three qualitative criteria: fluency, informativeness, and overall quality.
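To make the idea of reweighting training samples concrete, here is a minimal sketch in plain Python. It is not the method used in the paper, whose weighting scheme the abstract does not specify. It assumes that a sample's current loss serves as a proxy for its difficulty, that harder samples are down-weighted early in training, and that the weights relax toward uniform as training progresses. The function names and the `progress` and `temperature` parameters are hypothetical and chosen only for illustration.

```python
import math


def curriculum_weights(losses, progress, temperature=1.0):
    """Return one weight per sample (illustrative scheme, not the paper's).

    losses      -- per-sample losses, used here as a difficulty proxy (assumption)
    progress    -- fraction of training completed, in [0, 1]
    temperature -- larger values soften the easy/hard contrast (assumption)
    """
    if not losses:
        return []
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")

    mean_loss = sum(losses) / len(losses)
    # The curriculum effect is strongest at progress = 0 and vanishes at 1.
    strength = (1.0 - progress) / temperature
    # Samples with above-average loss are down-weighted, below-average up-weighted.
    raw = [math.exp(-strength * (loss - mean_loss)) for loss in losses]
    # Rescale so the weights average to 1, keeping the loss scale comparable.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_batch_loss(losses, progress, temperature=1.0):
    """Average the per-sample losses under the curriculum weights."""
    if not losses:
        return 0.0
    weights = curriculum_weights(losses, progress, temperature)
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


if __name__ == "__main__":
    batch = [0.5, 2.0, 4.0]  # made-up per-sample losses for demonstration
    for p in (0.0, 0.5, 1.0):
        print(p, curriculum_weights(batch, p), weighted_batch_loss(batch, p))
```

In a real training loop the per-sample losses would come from the model (for example, an unreduced cross-entropy over each summary), and the weights would typically be treated as constants when backpropagating.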

