Diffusion Large Language Models (dLLMs) offer a promising avenue for parallel generation but face a trade-off between decoding speed and quality. Revocable decoding strategies attempt to mitigate errors by verifying and remasking tokens, but they typically operate within a mixed-quality context in which reliable and erroneous tokens are treated alike. This leads to two critical failures: \textit{Error Propagation}, where new tokens absorb misleading information from erroneous context, and \textit{Local Error Reinforcement}, where errors mutually reinforce each other and thereby evade detection. To address these failures, we propose ASRD (Anchor Supervised Revocable Decoding), a training-free framework that operates within the embedding space. ASRD explicitly separates the decoding context into trusted \textit{Anchor Tokens}, identified via temporal consistency, and uncertain candidate tokens. Leveraging a dynamic Anchor Tokens Cache, we introduce two complementary mechanisms: (1) Anchor-Guided Generation, which injects entropy-weighted anchor signals into masked positions to implicitly steer attention toward the reliable global skeleton; and (2) Anchor-Perturbed Verification, which applies orthogonal perturbations to uncertain candidate tokens, destabilizing and remasking errors driven by fragile local consensus. Extensive experiments on math and coding benchmarks demonstrate that ASRD outperforms recent remasking baselines, achieving accuracy improvements of up to 6.4\% while increasing inference throughput by up to 7.2$\times$.
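To make two of the ideas above concrete, the following is a minimal sketch and not the ASRD implementation. It assumes that temporal consistency means a position's predicted token id is unchanged over the last few decoding steps, and that an orthogonal perturbation is a random offset with no component along the candidate's embedding. The function names, the window size, and the perturbation scale are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def find_anchors(pred_history, window=3):
    """Flag positions whose predicted token id was identical over the last
    `window` decoding steps (an assumed reading of "temporal consistency").

    pred_history: list of per-step predictions, each a sequence of token ids.
    Returns a boolean array of shape (seq_len,).
    """
    recent = np.asarray(pred_history)[-window:]  # shape (<=window, seq_len)
    if recent.shape[0] < window:
        # Not enough history yet: trust nothing.
        return np.zeros(recent.shape[1], dtype=bool)
    return np.all(recent == recent[0], axis=0)


def orthogonal_perturbation(embedding, rng, scale=0.1):
    """Add a random offset orthogonal to `embedding`.

    The offset's length is `scale` times the embedding norm (an assumed
    parameterization, not taken from the source).
    """
    norm = np.linalg.norm(embedding)
    unit = embedding / norm
    noise = rng.standard_normal(embedding.shape)
    noise -= np.dot(noise, unit) * unit  # drop the parallel component
    noise /= np.linalg.norm(noise)       # unit-length orthogonal direction
    return embedding + scale * norm * noise


if __name__ == "__main__":
    history = [[5, 7, 9], [5, 8, 9], [5, 8, 2]]
    print(find_anchors(history, window=3))

    rng = np.random.default_rng(0)
    e = np.array([1.0, 2.0, 2.0])
    p = orthogonal_perturbation(e, rng)
    print(np.dot(p - e, e))  # close to zero: the offset is orthogonal to e
```

In a verification step built on this sketch, a candidate whose prediction changes after its embedding is perturbed would be treated as fragile and remasked, while positions flagged as anchors would be left untouched.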