Controllable Text Generation (CTG) is an emerging area in the field of natural language generation (NLG). It is regarded as crucial for developing advanced text generation technologies that better meet the specific constraints of practical applications. In recent years, methods using large-scale pre-trained language models (PLMs), in particular the widely used Transformer-based PLMs, have become a new paradigm of NLG, enabling the generation of more diverse and fluent text. However, because deep neural networks offer limited interpretability, the controllability of these methods still needs to be guaranteed. To this end, controllable text generation using Transformer-based PLMs has become a rapidly growing yet challenging research hotspot. A diverse range of approaches has emerged over the past three to four years, targeting different CTG tasks that require different types of control constraints. In this paper, we present a systematic critical review of the common tasks, main approaches, and evaluation methods in this area. Finally, we discuss the challenges the field is facing and put forward various promising future directions. To the best of our knowledge, this is the first survey paper to summarize state-of-the-art CTG techniques from the perspective of Transformer-based PLMs. We hope it can help researchers and practitioners in related fields quickly track the academic and technological frontier, providing them with a landscape of the area and a roadmap for future research.