Score-based diffusion models (SBDMs) have recently emerged as state-of-the-art approaches for image generation. Existing SBDMs are typically formulated in a finite-dimensional setting, where images are treated as tensors of finite size. This paper develops SBDMs in the infinite-dimensional setting; that is, we model the training data as functions supported on a rectangular domain. Besides the quest for generating images at ever higher resolution, our primary motivation is to create a well-posed infinite-dimensional learning problem that can be discretized consistently on multiple resolution levels. We thereby intend to obtain diffusion models that generalize across resolution levels and make training more efficient. We demonstrate how to overcome two shortcomings of current SBDM approaches in the infinite-dimensional setting. First, we modify the forward process, using the notion of trace class operators, so that the latent distribution is well-defined in the infinite-dimensional setting, and we derive the reverse processes for finite approximations. Second, we illustrate that approximating the score function with an operator network is beneficial for multilevel training. After deriving the convergence of the discretization and the approximation of multilevel training, we implement an infinite-dimensional SBDM approach and show first promising results on MNIST and Fashion-MNIST that support the developed theory.
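To illustrate why a trace class covariance matters for the latent distribution, the following is a minimal sketch and not the implementation used in the paper. It assumes a one-dimensional domain (0, 1) instead of a rectangle, a sine eigenbasis, and eigenvalues of the hypothetical form k^(-decay); the function names are likewise illustrative. With decay = 0 (white noise) the trace grows with the number of modes, so the expected squared norm diverges as the resolution increases, whereas with decay = 2 the trace stays bounded and the same random coefficients can be evaluated on grids of any resolution.

```python
import math
import random


def eigenvalues(n_modes, decay):
    """Assumed covariance spectrum: lambda_k = k**(-decay), k = 1..n_modes."""
    return [k ** (-decay) for k in range(1, n_modes + 1)]


def covariance_trace(n_modes, decay):
    """Trace of the truncated covariance, i.e. the expected squared norm of a sample."""
    return sum(eigenvalues(n_modes, decay))


def sample_noise(n_modes, decay, n_grid, rng):
    """Draw one Gaussian sample via a truncated Karhunen-Loeve expansion.

    The sample is evaluated at the n_grid cell centers of (0, 1) using the
    orthonormal sine basis e_k(x) = sqrt(2) * sin(pi * k * x).
    """
    lam = eigenvalues(n_modes, decay)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n_modes)]  # i.i.d. N(0, 1) coefficients
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]
    return [
        sum(
            math.sqrt(l) * z * math.sqrt(2.0) * math.sin(math.pi * k * x)
            for k, (l, z) in enumerate(zip(lam, xi), start=1)
        )
        for x in grid
    ]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # White noise (decay = 0): the trace equals n and grows without bound.
        # decay = 2: the trace is bounded by pi**2 / 6.
        print(n, covariance_trace(n, 0.0), covariance_trace(n, 2.0))

    # Same seed, two resolutions: the coefficients are identical, only the grid changes.
    coarse = sample_noise(32, 2.0, 8, random.Random(0))
    fine = sample_noise(32, 2.0, 64, random.Random(0))
    print(len(coarse), len(fine))
```

Fixing the coefficients and changing only the evaluation grid is one simple way to see what a consistent discretization across resolution levels means here; the decay rate and the basis are choices made for this sketch only.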