Diffusion language models (DLMs) are a promising avenue for text generation because they support tractable controllable generation and do not have to predict text autoregressively. Despite these advantages, DLMs have not yet reached the performance of their autoregressive counterparts. One way to narrow the gap between these two types of language models is to speed up DLM generation. In this work, we propose a new methodology that addresses this issue: it allows more generation steps to be executed within a given time frame, potentially leading to higher-quality outputs. Specifically, our methods estimate how complete a DLM's text generation is and adaptively halt the generation process. We test and refine our methods on the Plaid, SSD, and CDCD DLMs and develop a cohesive perspective on their generation workflows. Finally, we confirm that our methods can halt the Plaid, SSD, and CDCD models early and decrease generation time by $10$-$40$% without a drop in the quality of model samples.
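The idea of adaptive halting described above can be illustrated with a minimal sketch. The abstract does not specify the completeness criteria, so the rule used here (stop once the decoded tokens have not changed for a few consecutive steps) is an assumption chosen for illustration, not the authors' method. The names `generate_with_adaptive_halting`, `step_fn`, `patience`, and `toy_step` are all hypothetical, and `toy_step` is a stand-in for a real denoising step of a model such as Plaid, SSD, or CDCD.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def generate_with_adaptive_halting(
    step_fn: Callable[[int], Sequence[int]],
    max_steps: int,
    patience: int = 3,
) -> Tuple[List[int], int]:
    """Run up to `max_steps` generation steps and halt early.

    `step_fn(t)` is a hypothetical stand-in for one DLM denoising step; it
    returns the token ids decoded at step `t`. Generation stops once the
    decoded tokens have been identical for `patience` consecutive steps
    (an assumed completeness criterion). Returns the final tokens and the
    number of steps actually executed.
    """
    if max_steps < 1 or patience < 1:
        raise ValueError("max_steps and patience must be at least 1")

    previous: Optional[List[int]] = None
    stable_steps = 0
    tokens: List[int] = []
    for t in range(max_steps):
        tokens = list(step_fn(t))
        # Count how long the decoded sequence has stayed unchanged.
        stable_steps = stable_steps + 1 if tokens == previous else 0
        if stable_steps >= patience:
            return tokens, t + 1
        previous = tokens
    # The criterion never fired, so the full step budget was used.
    return tokens, max_steps


def toy_step(t: int) -> List[int]:
    """Toy denoiser: the output keeps changing until step 5, then freezes."""
    frozen = min(t, 5)
    return [frozen, 2 * frozen]


if __name__ == "__main__":
    tokens, steps_used = generate_with_adaptive_halting(toy_step, max_steps=20)
    print(f"halted after {steps_used} of 20 steps with tokens {tokens}")
```

In this toy setting the loop stops well before the step budget is exhausted, which is the kind of saving the abstract reports. A real criterion would likely operate on model outputs such as token probabilities or latent norms rather than exact token equality.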