Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence. However, as we show in this paper, all previous attempts at this approach to text generation fail to sample correctly from the target language model distributions. To address this limitation, we consider the problem of designing text samplers that are faithful, meaning that they have the target text distribution as their limiting distribution. We propose several faithful gradient-based sampling algorithms that sample correctly from the target energy-based text distribution, and we study their theoretical properties. Through experiments on various forms of text generation, we demonstrate that faithful samplers generate more fluent text while adhering better to the control objectives.