Denoising diffusion models have spurred significant gains in density modeling and image generation, precipitating an industrial revolution in text-guided AI art generation. We introduce a new mathematical foundation for diffusion models inspired by classic results in information theory that connect Information with Minimum Mean Square Error regression, the so-called I-MMSE relations. We generalize the I-MMSE relations to exactly relate the data distribution to an optimal denoising regression problem, leading to an elegant refinement of existing diffusion bounds. This new insight leads to several improvements for probability distribution estimation, including theoretical justification for diffusion model ensembling. Remarkably, our framework shows how continuous and discrete probabilities can be learned with the same regression objective, avoiding domain-specific generative models used in variational methods. Code to reproduce experiments is provided at http://github.com/kxh001/ITdiffusion and simplified demonstration code is at http://github.com/gregversteeg/InfoDiffusionSimple.
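For readers unfamiliar with the classic I-MMSE relation that the abstract builds on: for a Gaussian noise channel Y = sqrt(snr) X + N with N ~ N(0, 1), the derivative of the mutual information I(X; Y) (in nats) with respect to snr equals half the minimum mean square error of estimating X from Y. The sketch below checks this numerically in the one case where both sides have simple closed forms, a standard Gaussian input X ~ N(0, 1). It only illustrates the classic relation, not the generalization introduced in the paper, and all function names are hypothetical choices made for this example.

```python
import math


def gaussian_mutual_information(snr: float) -> float:
    """I(X; sqrt(snr) X + N) in nats, assuming X ~ N(0, 1) and N ~ N(0, 1)."""
    return 0.5 * math.log1p(snr)


def gaussian_mmse(snr: float) -> float:
    """MMSE of estimating X ~ N(0, 1) from sqrt(snr) X + N, with N ~ N(0, 1)."""
    return 1.0 / (1.0 + snr)


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative of f at x using a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def check_i_mmse(snr: float) -> tuple[float, float]:
    """Return (dI/dsnr, mmse/2); the I-MMSE relation says these agree."""
    lhs = central_difference(gaussian_mutual_information, snr)
    rhs = 0.5 * gaussian_mmse(snr)
    return lhs, rhs


if __name__ == "__main__":
    for snr in (0.1, 1.0, 10.0):
        lhs, rhs = check_i_mmse(snr)
        print(f"snr={snr:5.1f}  dI/dsnr={lhs:.8f}  mmse/2={rhs:.8f}")
```

The Gaussian input is used here only because its mutual information and MMSE are known in closed form; for a general data distribution the MMSE side would have to be estimated, for example by a learned denoiser.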