We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories exploit the smoothness of score functions, they suffer from the curse of dimensionality for intrinsically high-dimensional data. This limitation is pronounced in graphical models such as Markov random fields, which are common models for image distributions and for which the approximation efficiency of score functions remains unestablished. To address this, we observe that score functions in graphical models can often be well approximated by variational inference denoising algorithms. Furthermore, these algorithms are amenable to efficient neural network representation. We demonstrate this for several examples of graphical models, including Ising models, conditional Ising models, restricted Boltzmann machines, and sparse encoding models. Combined with off-the-shelf discretization error bounds for diffusion-based sampling, we provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
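To make the link between variational denoising and score approximation concrete, the following is a minimal sketch rather than the construction analyzed in this work. It assumes an Ising prior p(x) ∝ exp(xᵀAx/2 + hᵀx) on x ∈ {-1, +1}^d, a noisy observation y = αx + σz with Gaussian z, and plain naive mean-field iterations as the variational inference step (a simplification chosen for illustration). The names `mean_field_denoiser`, `score_from_denoiser`, `alpha`, `sigma`, and `n_iter` are hypothetical. Each iteration is an affine map followed by `tanh`, which is why the unrolled algorithm resembles a deep network, and the score of the noisy marginal is recovered from the denoiser by Tweedie's formula.

```python
import numpy as np

def mean_field_denoiser(y, A, h, alpha, sigma, n_iter=50):
    """Naive mean-field approximation of E[x | y] (illustrative assumption).

    Prior: p(x) ∝ exp(x'Ax/2 + h'x) on {-1,+1}^d; observation y = alpha*x + sigma*z.
    The posterior is again an Ising model with external field h + alpha*y/sigma^2.
    """
    field = h + alpha * y / sigma**2      # effective external field of the posterior
    m = np.tanh(field)                    # initialization that ignores the couplings
    for _ in range(n_iter):
        # One fixed-point step: affine map + tanh, i.e. one network "layer".
        m = np.tanh(A @ m + field)
    return m

def score_from_denoiser(y, m, alpha, sigma):
    """Tweedie's formula: grad_y log p(y) = (alpha * E[x|y] - y) / sigma^2."""
    return (alpha * m - y) / sigma**2

# Small synthetic instance (all values are made up for illustration).
rng = np.random.default_rng(0)
d = 6
A = 0.05 * rng.standard_normal((d, d))
A = (A + A.T) / 2                         # symmetric couplings
np.fill_diagonal(A, 0.0)                  # no self-interaction
h = np.zeros(d)
alpha, sigma = 0.8, 0.6

x = rng.choice([-1.0, 1.0], size=d)       # a stand-in "clean" configuration
y = alpha * x + sigma * rng.standard_normal(d)

m = mean_field_denoiser(y, A, h, alpha, sigma)
score = score_from_denoiser(y, m, alpha, sigma)
print("approximate posterior mean:", m)
print("approximate score:", score)
```

When the couplings vanish (A = 0), the mean-field update is exact and reduces to the coordinate-wise denoiser tanh(h + αy/σ²). With nonzero couplings the output is only an approximation, and its quality depends on the interaction strength.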