Topic-to-essay generation (TEG) is a challenging natural language generation task that aims to produce paragraph-level text with high semantic coherence from a given set of topic words. Previous work has focused on introducing external knowledge while overlooking the limited diversity of the generated text. To improve generation diversity, we propose a novel copy mechanism model with a content selection module that integrates rich semantic knowledge from the language model into the decoder. Furthermore, we introduce an improved prefix tuning method to train the model, enabling it to adapt to varying input complexities. In addition, we contribute a new Chinese dataset for TEG tasks. Experimental results demonstrate that the proposed model improves the diversity of the generated text by 35% to 59% compared to the state-of-the-art method while maintaining a high level of topic consistency.