Controllable text generation is a challenging and meaningful field in natural language generation (NLG). Poetry generation in particular is a representative task: its conditions on the generated text are strict and well defined, which makes it an ideal testbed for assessing current methods. While prior work has succeeded in controlling either the semantic or the metrical aspects of poetry generation, addressing both at once remains a challenge. In this paper, we pioneer the use of diffusion models for generating sonnets and Chinese SongCi poetry to tackle this challenge. On the semantic side, our PoetryDiffusion model, built on a diffusion model, generates entire sentences or poems by taking the information of the whole sentence into account at once. This enhances semantic expression and distinguishes the approach from autoregressive models and large language models (LLMs). For metrical control, the separation between diffusion generation and its constraint control module lets us flexibly incorporate a novel metrical controller that manipulates and evaluates metrical properties (format and rhythm). The denoising process in PoetryDiffusion gradually improves semantics while flexibly integrating the metrical controller, which computes and imposes penalties on states that stray significantly from the target control distribution. Experimental results on two datasets demonstrate that our model outperforms existing models in automatic evaluation of semantic, metrical, and overall performance, as well as in human evaluation.
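The interplay between denoising and a penalty-based controller can be illustrated with a toy sketch. This is not the PoetryDiffusion implementation: the state here is a short list of floats rather than a sequence of token representations, `toy_denoiser` is a hypothetical stand-in for a learned denoising network, and `metrical_penalty`, `target_pattern`, and `guidance_scale` are names assumed for illustration. The sketch only shows the general idea that, after each denoising step, a controller scores how far the state is from a target pattern and nudges the state to reduce that penalty.

```python
import random


def metrical_penalty(state, target_pattern):
    """Hypothetical controller score: squared distance from the target pattern."""
    return sum((s - t) ** 2 for s, t in zip(state, target_pattern))


def penalty_gradient(state, target_pattern):
    """Gradient of metrical_penalty with respect to the state."""
    return [2.0 * (s - t) for s, t in zip(state, target_pattern)]


def toy_denoiser(state, semantic_target, rate=0.2):
    """Stand-in for a learned denoiser: moves the state toward a 'semantic' target."""
    return [s + rate * (m - s) for s, m in zip(state, semantic_target)]


def guided_denoise(initial_state, semantic_target, target_pattern,
                   steps=50, guidance_scale=0.1):
    """Alternate a denoising step with a penalty-reducing controller step."""
    state = list(initial_state)
    for _ in range(steps):
        # Denoising step: improves the (toy) semantic fit.
        state = toy_denoiser(state, semantic_target)
        # Controller step: penalize deviation from the target pattern.
        grad = penalty_gradient(state, target_pattern)
        state = [s - guidance_scale * g for s, g in zip(state, grad)]
    return state


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]  # start from pure noise
semantic_target = [0.5] * 8                      # assumed toy semantic optimum
target_pattern = [0.0, 1.0] * 4                  # assumed toy alternating "rhythm"

unguided = guided_denoise(noise, semantic_target, target_pattern, guidance_scale=0.0)
guided = guided_denoise(noise, semantic_target, target_pattern, guidance_scale=0.1)

print("penalty without controller:", metrical_penalty(unguided, target_pattern))
print("penalty with controller:   ", metrical_penalty(guided, target_pattern))
```

In this toy setting, setting `guidance_scale` to zero recovers plain denoising, while a positive value trades some semantic fit for a lower penalty. How the actual model balances the two is determined by its own controller design, which this sketch does not attempt to reproduce.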