We consider generative modeling based on smoothing an unknown density of interest in $\mathbb{R}^d$ with the factorial kernels introduced by Saremi and Srivastava (2022), which consist of $M$ independent Gaussian channels with equal noise levels. First, we fully characterize the time complexity of learning the resulting smoothed density in $\mathbb{R}^{Md}$, called the M-density, by deriving a universal form for its parametrization in which the score function is permutation equivariant by construction. Next, we study the time complexity of sampling an M-density by analyzing its condition number for Gaussian distributions. This spectral analysis gives geometric insight into the "shape" of M-densities as $M$ increases. Finally, we present results on sample quality for this class of generative models on the CIFAR-10 dataset, where we report a Fr\'echet inception distance of 14.15, notably obtained with a single noise level on long-run fast-mixing MCMC chains.