Amortizing intractable inference in large language models

Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address this limitation by using amortized Bayesian inference to sample from these intractable posteriors. Such amortization is algorithmically achieved by fine-tuning LLMs via diversity-seeking reinforcement learning algorithms: generative flow networks (GFlowNets). We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training and reward-maximizing policy optimization. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem and demonstrate that our approach enables data-efficient adaptation of LLMs to tasks that require multi-step rationalization and tool use.
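To make the intractability concrete, here is a minimal sketch of infilling as posterior inference. It is not the authors' method and involves no GFlowNet training. It assumes a hypothetical three-token bigram model (`VOCAB`, `BIGRAM`) in place of an LLM, and it fixes the infill length for simplicity. The posterior over an infill Z, given a prefix X and suffix Y, is proportional to the joint likelihood of the full sequence X Z Y, so exact normalization needs a sum over every candidate infill.

```python
import itertools

# Hypothetical toy bigram "language model": P(next | prev) over 3 tokens.
# All names and probabilities here are illustrative assumptions.
VOCAB = ["a", "b", "c"]
BIGRAM = {
    "<s>": {"a": 0.5, "b": 0.3, "c": 0.2},
    "a": {"a": 0.1, "b": 0.6, "c": 0.3},
    "b": {"a": 0.4, "b": 0.2, "c": 0.4},
    "c": {"a": 0.3, "b": 0.3, "c": 0.4},
}


def sequence_prob(tokens):
    """Autoregressive probability of a token sequence under the toy model."""
    prob, prev = 1.0, "<s>"
    for tok in tokens:
        prob *= BIGRAM[prev][tok]
        prev = tok
    return prob


def infill_posterior(prefix, suffix, length):
    """Exact posterior over fixed-length infills Z: p(Z | X, Y) is proportional to p(X Z Y)."""
    candidates = list(itertools.product(VOCAB, repeat=length))
    joint = {
        z: sequence_prob(list(prefix) + list(z) + list(suffix))
        for z in candidates
    }
    # The normalizer sums len(VOCAB) ** length terms: exponential in the infill length.
    normalizer = sum(joint.values())
    return {z: p / normalizer for z, p in joint.items()}


posterior = infill_posterior(prefix=["a"], suffix=["c"], length=2)
most_likely_infill = max(posterior, key=posterior.get)
```

With 3 tokens and an infill of length 2, the sum has only 9 terms. With a realistic vocabulary of tens of thousands of tokens and infills of even modest length, the same sum is far too large to enumerate, which is the motivation for training a sampler that amortizes this inference.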

