Despite the success of autoregressive large language models in text generation, generating text that satisfies complex constraints remains a major challenge: sampling from the conditional distribution ${\Pr}(\text{text} \mid \alpha)$ is intractable for even the simplest lexical constraints $\alpha$. To overcome this challenge, we propose GeLaTo (Generating Language with Tractable Constraints), a framework that uses tractable probabilistic models (TPMs) to impose lexical constraints on autoregressive text generation models. To demonstrate the effectiveness of this framework, we use distilled hidden Markov models, for which ${\Pr}(\text{text} \mid \alpha)$ can be computed efficiently, to guide autoregressive generation from GPT2. GeLaTo achieves state-of-the-art performance on challenging benchmarks for constrained text generation (e.g., CommonGen), beating various strong baselines by a large margin. Our work not only opens up new avenues for controlling large language models but also motivates the development of more expressive TPMs.
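To make the idea of tractable guidance concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-state HMM with made-up parameters (`INIT`, `TRANS`, `EMIT`), a single-keyword constraint $\alpha$ ("the keyword appears somewhere in the first $n$ tokens"), and a hypothetical `lm_probs` list standing in for a GPT2 next-token distribution. It also assumes one plausible way to combine the two models: reweight each candidate token by the HMM's probability that the constraint can still be satisfied, then renormalize.

```python
# Toy HMM over a 3-token vocabulary; all numbers are made up for illustration.
VOCAB = ["a", "b", "key"]
INIT = [0.6, 0.4]                           # Pr(z_1 = h)
TRANS = [[0.7, 0.3], [0.2, 0.8]]            # Pr(z_{t+1} = h2 | z_t = h)
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # Pr(x_t = v | z_t = h)
STATES = range(len(INIT))


def next_state_weights(prefix):
    """Unnormalized weights w[h] = Pr(prefix, next hidden state = h)."""
    w = INIT[:]
    for tok in prefix:
        v = VOCAB.index(tok)
        w = [sum(w[h] * EMIT[h][v] * TRANS[h][h2] for h in STATES)
             for h2 in STATES]
    return w


def constraint_prob(prefix, keyword, n):
    """Pr(keyword occurs in x_1..x_n | first tokens equal prefix) under the HMM."""
    if keyword in prefix:
        return 1.0
    k = VOCAB.index(keyword)
    # avoid[h] = Pr(all remaining tokens differ from keyword | next hidden state = h),
    # built up one remaining position at a time by dynamic programming.
    avoid = [1.0 for _ in STATES]
    for _ in range(n - len(prefix)):
        avoid = [(1.0 - EMIT[h][k]) * sum(TRANS[h][h2] * avoid[h2] for h2 in STATES)
                 for h in STATES]
    w = next_state_weights(prefix)
    return 1.0 - sum(wh * ah for wh, ah in zip(w, avoid)) / sum(w)


def guided_next(lm_probs, prefix, keyword, n):
    """Reweight a (stand-in) LM next-token distribution by the HMM constraint probability."""
    scores = [p * constraint_prob(prefix + [tok], keyword, n)
              for tok, p in zip(VOCAB, lm_probs)]
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    lm_probs = [0.5, 0.4, 0.1]  # hypothetical LM that rarely proposes "key"
    print(guided_next(lm_probs, ["a"], "key", 3))
```

The dynamic program in `constraint_prob` runs in time linear in the sequence length and quadratic in the number of hidden states, which is what makes the constraint probability cheap to evaluate at every decoding step in this toy setting. An autoregressive model alone offers no comparable shortcut.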