Generative models trained by maximizing model likelihood have gained traction in many practical settings. Among them, perturbation-based approaches underpin many strong likelihood-estimation models, yet they often suffer from slow convergence and limited theoretical understanding. In this paper, we derive a tighter likelihood bound for noise-driven models that improves both the accuracy and efficiency of maximum-likelihood learning. Our key insight extends the classical relationship between KL divergence and Fisher information to arbitrary noise perturbations, going beyond the Gaussian assumption and enabling structured noise distributions. This formulation permits flexible, randomized noise distributions that naturally account for sensor artifacts, quantization effects, and data-distribution smoothing, while remaining compatible with standard diffusion training. Treating the diffusion process as a Gaussian channel, we further express the mismatched entropy between the data and model distributions, showing that the proposed objective upper-bounds the negative log-likelihood (NLL). In experiments, our models achieve competitive NLL on CIFAR-10 and state-of-the-art results on ImageNet across multiple resolutions, all without data augmentation, and the framework extends naturally to discrete data.
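For context, the classical Gaussian-case relation that the abstract generalizes is de Bruijn's identity and its mismatched (relative) analogue under heat flow; the sketch below states the standard forms, with notation chosen here for illustration (the paper's generalized statement for arbitrary noise may differ):

```latex
% De Bruijn's identity (classical Gaussian case).
% Let Y_t = X + \sqrt{t}\, Z with Z \sim \mathcal{N}(0, I) independent of X.
% The differential entropy of the perturbed variable satisfies
\frac{\mathrm{d}}{\mathrm{d}t}\, h(Y_t) = \frac{1}{2}\, J(Y_t),
% where J(\cdot) is the Fisher information of the perturbed law.
%
% Relative analogue: if p_t and q_t are the data and model laws
% pushed through the same Gaussian channel (heat flow), then
\frac{\mathrm{d}}{\mathrm{d}t}\, D_{\mathrm{KL}}(p_t \,\|\, q_t)
  = -\frac{1}{2}\, J(p_t \,\|\, q_t),
% with relative Fisher information
J(p \,\|\, q) = \mathbb{E}_{p}\!\left[\,\bigl\|\nabla \log \tfrac{p}{q}\bigr\|^2\right].
```

The second identity is what connects likelihood (KL) objectives to score-matching-style Fisher-information terms in the Gaussian setting; the abstract's claim is that an analogous bound holds beyond Gaussian perturbations.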