Recently, contrastive learning has attracted increasing interest in neural text generation as a new solution to alleviate the exposure bias problem. It introduces a sequence-level training signal, which is crucial to generation tasks that always rely on auto-regressive decoding. However, previous methods using contrastive learning in neural text generation usually lead to inferior performance. In this paper, we analyse the underlying reasons and propose a new Contrastive Neural Text generation framework, CoNT. CoNT addresses the bottlenecks that prevent contrastive learning from being widely adopted in generation tasks from three aspects: the construction of contrastive examples, the choice of the contrastive loss, and the decoding strategy. We validate CoNT on ten benchmarks spanning five generation tasks: machine translation, summarization, code comment generation, data-to-text generation and commonsense generation. Experimental results show that CoNT clearly outperforms the conventional training framework on all ten benchmarks by a convincing margin. In particular, CoNT surpasses the previous most competitive contrastive learning method for text generation by 1.50 BLEU on machine translation and 1.77 ROUGE-1 on summarization. It achieves new state-of-the-art results on summarization, code comment generation (without external data) and data-to-text generation.