Recent Uniform State Diffusion Models (USDMs), which are initialized from a uniform prior, promise fast text generation thanks to an inherent self-correction ability that masked diffusion models lack. However, they still rely on complex loss formulations that incur additional computational overhead and hinder scalability. In this work, we explore a simplified denoising-based loss for USDMs that optimizes only noise-replaced tokens, stabilizing training while matching the performance of prior methods built on more complex objectives. In addition, we introduce an efficient regularization term that mitigates the collapse of the output distribution toward uniformity, further improving performance. We demonstrate the effectiveness and efficiency of our simplified and regularized loss formulations by pretraining models on text datasets widely used for USDMs. More importantly, our conclusions carry over to larger models, showing strong potential for large-scale training.
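To make the simplified objective concrete, the following is a minimal PyTorch sketch under one reading of the abstract: the loss is a token-level cross-entropy computed only on positions where the forward process replaced the original token with uniform noise. The function name `simplified_denoising_loss`, the tensor shapes, and the identification of replaced positions via `noisy_ids != clean_ids` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def simplified_denoising_loss(logits, clean_ids, noisy_ids):
    """Cross-entropy restricted to noise-replaced tokens (hypothetical sketch).

    Assumed shapes:
      logits:    (B, L, V) model predictions given the noisy sequence
      clean_ids: (B, L)    original token ids (targets)
      noisy_ids: (B, L)    sequence after uniform-state corruption
    """
    # Positions where the forward process swapped in a uniform-noise token.
    replaced = (noisy_ids != clean_ids).float()  # (B, L)

    # Token-level cross-entropy against the clean sequence.
    per_token = F.cross_entropy(
        logits.transpose(1, 2),  # (B, V, L), the layout cross_entropy expects
        clean_ids,
        reduction="none",
    )  # (B, L)

    # Average only over replaced positions; guard against an empty mask.
    return (per_token * replaced).sum() / replaced.sum().clamp(min=1.0)
```

In this reading, the regularization term described in the abstract would be added on top of this masked cross-entropy as a separate penalty discouraging near-uniform predictive distributions; its exact form is not specified here.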