Maximum mean discrepancy (MMD) flows suffer from high computational costs in large-scale computations. In this paper, we show that MMD flows with Riesz kernels $K(x,y) = - \|x-y\|^r$, $r \in (0,2)$, have exceptional properties which allow for their efficient computation. First, the MMD of Riesz kernels coincides with the MMD of their sliced version. As a consequence, gradients of MMDs can be computed in the one-dimensional setting. Here, for $r=1$, a simple sorting algorithm can be applied to reduce the complexity from $O(MN+N^2)$ to $O((M+N)\log(M+N))$ for two empirical measures with $M$ and $N$ support points. In the implementation, we approximate the gradient of the sliced MMD using only a finite number $P$ of slices. We show that the resulting error is of order $O(\sqrt{d/P})$, where $d$ is the data dimension. These results enable us to train generative models by approximating MMD gradient flows with neural networks, even in large-scale applications. We demonstrate the efficiency of our model by image generation on MNIST, FashionMNIST and CIFAR10.
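To make the sorting idea for $r=1$ concrete, the following is a minimal sketch and not the authors' implementation. It assumes the functional $D(x) = -\frac{1}{2N^2}\sum_{i,k}\|x_i-x_k\| + \frac{1}{NM}\sum_{i,j}\|x_i-y_j\|$ for particles $x_1,\dots,x_N$ and fixed targets $y_1,\dots,y_M$ (the term depending only on $y$ is dropped, and the normalization is chosen for illustration). All function names are hypothetical. The slicing constant $c_d = \sqrt{\pi}\,\Gamma(\frac{d+1}{2})/\Gamma(\frac{d}{2})$ comes from the identity $\|v\| = c_d\,\mathbb{E}_\xi|\langle \xi, v\rangle|$ for $\xi$ uniform on the unit sphere.

```python
import math
import numpy as np


def grad_1d_sorted(x, y):
    """Gradient of D w.r.t. 1D particles x using sorting and binary search."""
    n, m = x.size, y.size
    xs, ys = np.sort(x), np.sort(y)
    # sum_k sign(x_i - x_k) = #{x_k < x_i} - #{x_k > x_i}
    #                       = left + right - n, with searchsorted counts.
    sx = np.searchsorted(xs, x, side="left") + np.searchsorted(xs, x, side="right") - n
    # Same counting argument against the sorted target points.
    sy = np.searchsorted(ys, x, side="left") + np.searchsorted(ys, x, side="right") - m
    return -sx / n**2 + sy / (n * m)


def grad_1d_naive(x, y):
    """Reference gradient using all pairwise signs, O(N^2 + NM)."""
    n, m = x.size, y.size
    sx = np.sign(x[:, None] - x[None, :]).sum(axis=1)
    sy = np.sign(x[:, None] - y[None, :]).sum(axis=1)
    return -sx / n**2 + sy / (n * m)


def grad_sliced(x, y, num_slices, rng):
    """Monte Carlo estimate of the d-dimensional gradient from P random slices."""
    d = x.shape[1]
    c_d = math.sqrt(math.pi) * math.gamma((d + 1) / 2) / math.gamma(d / 2)
    grad = np.zeros_like(x)
    for _ in range(num_slices):
        xi = rng.standard_normal(d)
        xi /= np.linalg.norm(xi)              # uniform direction on the sphere
        g = grad_1d_sorted(x @ xi, y @ xi)    # 1D gradient of the projections
        grad += g[:, None] * xi[None, :]      # chain rule: lift back along xi
    return c_d * grad / num_slices


def grad_full_naive(x, y):
    """Reference d-dimensional gradient using all pairwise differences."""
    n, m = x.shape[0], y.shape[0]
    dx = x[:, None, :] - x[None, :, :]
    dy = x[:, None, :] - y[None, :, :]
    nx = np.linalg.norm(dx, axis=2, keepdims=True)
    ny = np.linalg.norm(dy, axis=2, keepdims=True)
    # Unit vectors; coinciding points contribute zero.
    ux = np.divide(dx, nx, out=np.zeros_like(dx), where=nx > 0)
    uy = np.divide(dy, ny, out=np.zeros_like(dy), where=ny > 0)
    return -ux.sum(axis=1) / n**2 + uy.sum(axis=1) / (n * m)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((20, 2))
    Y = rng.standard_normal((30, 2)) + 3.0
    exact = grad_full_naive(X, Y)
    approx = grad_sliced(X, Y, num_slices=2000, rng=rng)
    print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

In one dimension the sorted and naive variants agree exactly, while the sliced estimate only approaches the full gradient as the number of slices grows, in line with the $O(\sqrt{d/P})$ error bound stated above.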