Despite the success of autoregressive large language models in text generation, it remains a major challenge to generate text that satisfies complex constraints: sampling from the conditional distribution $\Pr(\text{text} \mid \alpha)$ is intractable for even the simplest lexical constraints $\alpha$. To overcome this challenge, we propose to use tractable probabilistic models to impose lexical constraints in autoregressive text generation, a framework we refer to as GeLaTo. To demonstrate the effectiveness of this framework, we use distilled hidden Markov models to control autoregressive generation from GPT2. GeLaTo achieves state-of-the-art performance on CommonGen, a challenging benchmark for constrained text generation, beating a wide range of strong baselines by a large margin. Our work not only opens up new avenues for controlling large language models but also motivates the development of more expressive tractable probabilistic models.
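To make the idea concrete, the following is a minimal sketch of how a hidden Markov model could guide next-token sampling toward a lexical constraint. It assumes, beyond what the abstract states, that the next-token distribution is reweighted as $\Pr(x_{t+1} \mid x_{1:t}, \alpha) \propto \Pr_{\text{LM}}(x_{t+1} \mid x_{1:t}) \cdot \Pr_{\text{HMM}}(\alpha \mid x_{1:t+1})$, and that $\alpha$ is the simple constraint "a given keyword token appears somewhere in a sequence of fixed length". All function names, the toy parameters, and the fixed `lm_probs` vector standing in for the language model are hypothetical and for illustration only.

```python
# Toy sketch (assumed setup, not the authors' implementation):
# reweight a language model's next-token distribution by an HMM's estimate
# of Pr(constraint is eventually satisfied | prefix + candidate token).

def step(belief, A):
    """Push a distribution over hidden states one step through transitions A."""
    n = len(A)
    return [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]

def state_posterior(pi, A, B, prefix):
    """Pr(z_t | x_{1:t}): forward pass over the observed prefix, then normalize."""
    belief = [p * row[prefix[0]] for p, row in zip(pi, B)]
    for tok in prefix[1:]:
        belief = [b * row[tok] for b, row in zip(step(belief, A), B)]
    total = sum(belief)
    return [b / total for b in belief]

def prob_keyword_ahead(belief, A, B, keyword, steps):
    """Pr(keyword is emitted at least once in the next `steps` tokens)."""
    alive = list(belief)  # probability mass on paths that have avoided the keyword
    for _ in range(steps):
        alive = [a * (1.0 - row[keyword]) for a, row in zip(step(alive, A), B)]
    return max(0.0, 1.0 - sum(alive))  # clip tiny negative rounding error

def constrained_next_token(lm_probs, pi, A, B, prefix, keyword, total_len):
    """Next-token distribution proportional to p_LM(token) * p_HMM(constraint | prefix + token)."""
    remaining = total_len - len(prefix) - 1  # tokens left after the candidate
    scores = []
    for tok, p in enumerate(lm_probs):
        if keyword in prefix or tok == keyword:
            weight = 1.0  # constraint already satisfied
        else:
            belief = state_posterior(pi, A, B, prefix + [tok])
            weight = prob_keyword_ahead(belief, A, B, keyword, remaining)
        scores.append(p * weight)
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical 2-state HMM over a 3-token vocabulary.
pi = [0.6, 0.4]                          # initial state distribution
A = [[0.7, 0.3], [0.2, 0.8]]             # A[i][j] = Pr(z' = j | z = i)
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # B[i][v] = Pr(x = v | z = i)
lm_probs = [0.5, 0.3, 0.2]               # stand-in for the LM's next-token output

dist = constrained_next_token(lm_probs, pi, A, B, prefix=[0], keyword=2, total_len=4)
print(dist)
```

In this sketch the keyword's probability can only go up relative to `lm_probs` while the constraint is unsatisfied, and once only one position remains all mass falls on the keyword. The quantity `prob_keyword_ahead` is computable exactly in time linear in the sequence length, which is the kind of tractability that the autoregressive model alone does not offer.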