Diffusion Large Language Models (dLLMs) have demonstrated promising generative capabilities and are increasingly used to produce formal languages defined by context-free grammars, such as source code and chemical expressions. As probabilistic models, however, they still struggle to reliably generate syntactically valid outputs. A natural and promising remedy is to adapt constrained decoding techniques that enforce grammatical correctness during generation, but applying these techniques faces two primary obstacles. On the one hand, the non-autoregressive nature of dLLMs renders most existing constrained decoding approaches inapplicable. On the other hand, current approaches designed specifically for dLLMs may admit intermediate outputs that cannot be completed into valid sentences, which significantly limits their reliability in practice. To address these challenges, we present LAVE, a constrained decoding approach tailored to dLLMs. Our approach leverages a key property of dLLMs: their ability to predict token distributions for all positions in parallel during each forward pass. Whenever the model proposes a new token, LAVE performs lookahead over these distributions to efficiently and reliably verify the validity of the proposed token. This design enforces the constraints soundly by guaranteeing that every intermediate output can still be extended into a valid sentence. Extensive experiments across four widely used dLLMs and three representative benchmarks demonstrate that LAVE consistently outperforms existing baselines, achieving substantial improvements in syntactic correctness while incurring negligible runtime overhead.
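To make the lookahead idea concrete, the following is a minimal hypothetical sketch (not the authors' implementation): a toy balanced-parentheses grammar stands in for a context-free grammar, `dists` stands in for the model's parallel per-position token distributions, and a proposed token is accepted only if greedily completing the remaining masked positions from those distributions still yields a valid sentence. All function and variable names are illustrative assumptions.

```python
# Hypothetical sketch of lookahead-based token verification for a dLLM.
# Toy grammar: balanced parentheses over the alphabet {"(", ")"}.

def is_valid_prefix(tokens):
    """A prefix is completable iff the running depth never goes negative."""
    depth = 0
    for t in tokens:
        depth += 1 if t == "(" else -1
        if depth < 0:
            return False
    return True

def is_valid_sentence(tokens):
    """A full sentence is valid iff it is a completable prefix with depth 0."""
    balance = sum(1 if t == "(" else -1 for t in tokens)
    return is_valid_prefix(tokens) and balance == 0

def lookahead_verify(tokens, pos, proposed, dists):
    """Accept `proposed` at `pos` only if a greedy completion of the
    remaining masked positions (None) forms a valid sentence.
    `dists[i]` maps each token to its probability at position i,
    standing in for the dLLM's parallel per-position predictions."""
    candidate = list(tokens)
    candidate[pos] = proposed
    for i, t in enumerate(candidate):
        if t is None:  # still masked: fill with that position's argmax
            candidate[i] = max(dists[i], key=dists[i].get)
    return is_valid_sentence(candidate)

# Usage: 4 positions, position 1 already decoded as "(".
tokens = [None, "(", None, None]
dists = [
    {"(": 0.9, ")": 0.1},
    {"(": 1.0},
    {"(": 0.3, ")": 0.7},
    {"(": 0.2, ")": 0.8},
]
print(lookahead_verify(tokens, 0, "(", dists))  # "(())" -> accepted
print(lookahead_verify(tokens, 0, ")", dists))  # ")(..." -> rejected
```

A real implementation would verify against the target context-free grammar's prefix language rather than re-checking full sentences, but the control flow — propose, greedily complete from the parallel distributions, check validity — is the same.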