Diffusion language models (DLMs) are a promising avenue for text generation because of their practical properties for tractable controllable generation. They also have the advantage of not having to predict text autoregressively. Despite these notable features, however, DLMs have not yet reached the performance of their autoregressive counterparts. One way to narrow the performance gap between these two types of language models is to speed up DLM generation. In this work, we propose a novel methodology to address this issue. It enables more generation steps to be executed within a given time frame, leading to higher-quality outputs. Specifically, our methods estimate the completeness of a DLM's text generation and allow the generation process to be halted adaptively. We evaluate our methods on the Plaid, SSD, and CDCD DLMs and create a cohesive perspective on their generation workflows. Finally, we confirm that our methods allow these models to be halted early and decrease generation time by $10$-$40$\% without a drop in the quality of model samples.