Large-scale causal language models (CLMs), e.g., GPT-3 and ChatGPT, have brought great success in text generation. However, controlling the generation process of a CLM while balancing flexibility, control granularity, and generation efficiency remains an open challenge. In this paper, we offer a new alternative for controllable text generation (CTG): a non-intrusive, lightweight control plugin that accompanies CLM generation at arbitrary time steps. The proposed plugin, the Residual Memory Transformer (RMT), has an encoder-decoder architecture that accepts any type of control condition and cooperates with the CLM through a residual learning paradigm, yielding more flexible, general, and efficient CTG. We conduct extensive experiments on a range of control tasks, with both automatic and human evaluations. The results show that RMT outperforms a range of state-of-the-art approaches, demonstrating the effectiveness and versatility of our approach.
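To make the residual learning idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the plugin's contribution is added to the frozen CLM's next-token logits before normalization; the abstract does not state where the two models are combined. The names `residual_step`, `clm_logits`, and `plugin_logits`, and all numeric values, are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def residual_step(clm_logits, plugin_logits):
    """Combine frozen-CLM logits with a plugin correction (assumed additive).

    The CLM is left untouched; the plugin only contributes a residual term.
    """
    if len(clm_logits) != len(plugin_logits):
        raise ValueError("logit vectors must cover the same vocabulary")
    combined = [c + r for c, r in zip(clm_logits, plugin_logits)]
    return softmax(combined)


# Toy 3-token vocabulary with hypothetical values.
clm_logits = [2.0, 1.0, 0.0]     # the frozen CLM prefers token 0
plugin_logits = [0.0, 0.0, 3.0]  # the control plugin pushes toward token 2
zero_plugin = [0.0, 0.0, 0.0]    # no control signal

base = residual_step(clm_logits, zero_plugin)
controlled = residual_step(clm_logits, plugin_logits)
```

Under this assumption, a zero residual leaves the CLM's distribution unchanged, which is one way to read "non-intrusive": the plugin can be switched on or off at any time step without modifying the base model.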