Sampling from Flow Language Models via Marginal-Conditioned Bridges

Flow Language Models (FLMs) are a recently introduced class of language models that adapt continuous flow matching to one-hot-encoded token sequences. Their denoisers have a special structure absent from generic continuous diffusion models: each block of the denoising mean is a posterior marginal distribution over the clean token at that position. Standard DDPM-style samplers collapse these marginals to a single conditional-mean endpoint and bridge toward this simplex-valued point, which is generally not a valid one-hot sequence. We argue that the natural sampler for an FLM is instead posterior-predictive. At each reverse step, we sample a clean one-hot endpoint from the factorized posterior defined by the FLM token marginals, and then sample the next continuous state from the analytic Ornstein-Uhlenbeck bridge conditioned on that endpoint. The method is training-free, uses the same model evaluations as standard sampling, and gives a principled interface for token-level decoding controls such as temperature scaling and nucleus truncation. We show that, under exact posterior marginals, the endpoint approximation error is exactly the conditional multi-information among token positions. The induced one-step bridge kernel preserves all token-wise posterior-predictive marginals and loses only the residual cross-position dependence. Finally, we prove a Girsanov path-space comparison showing that the denoising-error term of the marginal-conditioned bridge is no larger than that of the frozen conditional-mean bridge, with strict improvement whenever intermediate coordinate-wise bridge observations reveal additional information about the clean token. Experiments with FLMs show that the sampler improves the quality-diversity tradeoff. Code is available at: github.com/imbirik/mcb.
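
The reverse step described above (sample a one-hot endpoint from the token marginals, then sample the next state from the bridge conditioned on it) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation from the linked repository. The forward process is assumed to be the Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW with clean data at time 0, so that X_t given X_0 is Gaussian with mean exp(-t) X_0 and variance 1 - exp(-2t); the abstract does not state the schedule. The callable `denoiser` and all function names are hypothetical.

```python
import numpy as np

def nucleus_filter(probs, top_p):
    """Per position, keep the smallest top-probability set whose mass reaches top_p."""
    order = np.argsort(-probs, axis=-1)
    sorted_p = np.take_along_axis(probs, order, axis=-1)
    cum = np.cumsum(sorted_p, axis=-1)
    # A token is kept if the mass strictly before it is below top_p,
    # so the most likely token is always kept.
    keep_sorted = (cum - sorted_p) < top_p
    keep = np.zeros_like(keep_sorted)
    np.put_along_axis(keep, order, keep_sorted, axis=-1)
    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum(axis=-1, keepdims=True)

def sample_endpoint(marginals, rng, temperature=1.0, top_p=1.0):
    """Sample a one-hot endpoint from factorized marginals of shape (L, V)."""
    logp = np.log(np.clip(marginals, 1e-12, None)) / temperature
    logp = logp - logp.max(axis=-1, keepdims=True)
    probs = np.exp(logp)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    if top_p < 1.0:
        probs = nucleus_filter(probs, top_p)
    vocab = probs.shape[-1]
    # Positions are sampled independently: this is the factorized posterior.
    tokens = np.array([rng.choice(vocab, p=row) for row in probs])
    return np.eye(vocab)[tokens]

def ou_bridge_step(x_t, x0, t, s, rng):
    """Sample X_s given X_t = x_t and X_0 = x0 for 0 <= s < t (assumed OU schedule)."""
    a_s, v_s = np.exp(-s), 1.0 - np.exp(-2.0 * s)                  # X_s | X_0
    a_ts, v_ts = np.exp(-(t - s)), 1.0 - np.exp(-2.0 * (t - s))    # X_t | X_s
    v_t = 1.0 - np.exp(-2.0 * t)                                   # X_t | X_0
    mean = (v_ts * a_s * x0 + a_ts * v_s * x_t) / v_t
    var = v_s * v_ts / v_t
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)

def mcb_step(x_t, t, s, denoiser, rng, temperature=1.0, top_p=1.0):
    """One marginal-conditioned bridge step from time t to time s.

    `denoiser(x_t, t)` is a hypothetical callable returning an (L, V) array
    whose rows are posterior marginals over the clean token at each position.
    """
    marginals = denoiser(x_t, t)
    x0 = sample_endpoint(marginals, rng, temperature, top_p)
    return ou_bridge_step(x_t, x0, t, s, rng)
```

A conditional-mean sampler in the same notation would pass `marginals` itself as `x0` to `ou_bridge_step`; the only change here is that a sampled one-hot endpoint replaces it, which is also where temperature and nucleus truncation enter.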

