Many applications of text generation, such as summarization, benefit from accurate control of the output length. Existing approaches to length-controlled summarization either degrade performance or control the length only approximately. In this work, we present a framework that generates summaries with precisely the specified number of tokens or sentences while maintaining or even improving text quality. In addition, we jointly train the models to predict the length, so that our model can generate summaries of optimal length. We evaluate the proposed framework on the CNNDM dataset and show improved performance over existing methods.