Plan for Speed: Dilated Scheduling for Masked Diffusion Language Models

Masked diffusion language models (MDLMs) promise fast, non-autoregressive text generation, yet existing samplers, which pick tokens to unmask based on model confidence, ignore interactions when unmasking multiple positions in parallel and effectively reduce to slow, autoregressive behavior. We propose the Dilated Unmasking Scheduler (DUS), an inference-only, planner-model-free method that partitions sequence positions into non-adjacent dilated groups and unmasks them in parallel so as to minimize an upper bound on joint entropy gain at each denoising step. By explicitly trading off the number of network calls against generation quality, DUS recovers most of the performance lost under traditional parallel unmasking strategies. Across math (GSM8K, MATH500), code (HumanEval, MBPP), general-knowledge (BBH, MMLU-Pro), and instruction-following (IFEval) benchmarks, DUS outperforms confidence-based planners and turns the diffusion-specific quality-speed trade-off into a deterministic, predictable speedup set by the block size $B$, yielding up to $5.8\times$ wall-clock speedup over token-by-token MDLM decoding without modifying the underlying denoiser. Applied as a drop-in post-filter, dilated spacing also improves adaptive samplers. Code is available at https://github.com/omerlux/DUS.
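To make the idea of "non-adjacent dilated groups" concrete, here is a minimal sketch of one way to partition the positions of a block so that no two positions unmasked in the same pass are neighbors. The function name `dilated_groups` and the stride-halving rule are assumptions made for illustration; the abstract does not specify the exact schedule, so this should not be read as the authors' implementation (see the linked repository for that).

```python
from typing import List


def dilated_groups(block_size: int) -> List[List[int]]:
    """Split positions 0..block_size-1 into groups of non-adjacent positions.

    Assumed schedule (illustrative, not taken from the paper): the stride
    starts at the smallest power of two >= block_size and halves each pass;
    each pass takes the still-masked positions that are multiples of the stride.
    """
    if block_size < 1:
        raise ValueError("block_size must be a positive integer")

    # Smallest power of two that is >= block_size.
    stride = 1
    while stride < block_size:
        stride *= 2

    groups: List[List[int]] = []
    unmasked = set()
    while stride >= 1:
        # Positions on the current stride grid that are still masked.
        group = [p for p in range(0, block_size, stride) if p not in unmasked]
        if group:
            groups.append(group)
            unmasked.update(group)
        stride //= 2
    return groups


if __name__ == "__main__":
    for step, group in enumerate(dilated_groups(8), start=1):
        print(f"pass {step}: unmask positions {group}")
```

For a block of 8 positions this sketch yields the groups [0], [4], [2, 6], [1, 3, 5, 7]: four denoiser calls instead of eight, with every pair of positions in the same group at least two apart. The number of passes depends only on the block size, which is the sense in which such a schedule is deterministic; the actual pass count and grouping used by DUS may differ.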
