Curriculum-Guided Abstractive Summarization

Recent Transformer-based models offer a promising approach to abstractive summarization. They go beyond sentence selection and extractive strategies to handle more complex tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have two shortcomings: (1) they often perform poorly at content selection, and (2) their training strategy is inefficient, which restricts model performance. In this paper, we explore two orthogonal ways to compensate for these shortcomings. First, we augment the Transformer network with a sentence cross-attention module in the decoder, encouraging more abstraction of salient content. Second, we use a curriculum learning approach to reweight the training samples, yielding a more efficient learning procedure. The second approach, which improves the training strategy of Transformer networks, yields larger gains than the first. We apply our model to the extreme summarization dataset of Reddit TIFU posts. We further examine three cross-domain summarization datasets (Webis-TLDR-17, CNN/DM, and XSum) to measure the efficacy of curriculum learning when applied to summarization. Moreover, we conduct a human evaluation to assess the proposed method on qualitative criteria, namely fluency, informativeness, and overall quality.
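The abstract does not say how the curriculum reweights training samples, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the per-sample training loss serves as a difficulty proxy and that a sigmoid gate around a moving threshold assigns the weights; the names `curriculum_weights`, `weighted_loss`, `progress`, and `temperature` are hypothetical.

```python
import math


def curriculum_weights(losses, progress, temperature=1.0):
    """Return one weight in (0, 1) per sample.

    losses: per-sample training losses, used as a difficulty proxy (assumption).
    progress: fraction of training completed, in [0, 1].
    temperature: softness of the gate (hypothetical hyperparameter).
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    lo, hi = min(losses), max(losses)
    # The threshold moves from the easiest loss to the hardest loss as
    # training proceeds, so hard samples are admitted gradually.
    threshold = lo + progress * (hi - lo)
    # Sigmoid gate: samples with loss below the threshold get weight > 0.5.
    return [1.0 / (1.0 + math.exp((loss - threshold) / temperature))
            for loss in losses]


def weighted_loss(losses, weights):
    """Weighted mean of the per-sample losses."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


losses = [0.5, 2.0, 4.0]                    # made-up per-sample losses
early = curriculum_weights(losses, 0.0)     # start of training
late = curriculum_weights(losses, 1.0)      # end of training
```

Under these assumptions, easy samples dominate the objective early in training, and the hardest sample's weight rises as `progress` approaches 1. In a real training loop the per-sample losses would come from the model's unreduced loss on each batch.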
