Maximum mean discrepancy (MMD) refers to a general class of nonparametric two-sample tests that are based on maximizing the mean difference between samples from one distribution $P$ and samples from another distribution $Q$, over all choices of data transformations $f$ living in some function space $\mathcal{F}$. Inspired by recent work that connects what are known as functions of $\textit{Radon bounded variation}$ (RBV) and neural networks (Parhi and Nowak, 2021, 2023), we study the MMD defined by taking $\mathcal{F}$ to be the unit ball in the RBV space of a given smoothness order $k \geq 0$. This test, which we refer to as the $\textit{Radon-Kolmogorov-Smirnov}$ (RKS) test, can be viewed as a generalization of the classical Kolmogorov-Smirnov (KS) test to multiple dimensions and higher orders of smoothness. It is also intimately connected to neural networks: we prove that the witness in the RKS test -- the function $f$ achieving the maximum mean difference -- is always a ridge spline of degree $k$, i.e., a single neuron in a neural network. This allows us to leverage the power of modern deep learning toolkits to (approximately) optimize the criterion that underlies the RKS test. We prove that the RKS test has asymptotically full power at distinguishing any distinct pair $P \neq Q$ of distributions, derive its asymptotic null distribution, and carry out extensive experiments to elucidate the strengths and weaknesses of the RKS test versus the more traditional kernel MMD test.
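To make the single-neuron witness concrete, the following is a minimal sketch of the criterion $\max_{w, b} |P_m f_{w,b} - Q_n f_{w,b}|$, where $P_m$ and $Q_n$ denote empirical means over the two samples. It assumes the ridge spline takes the truncated-power form $f_{w,b}(x) = (w^\top x - b)_+^k$ with a unit-norm direction $w$ and a real offset $b$; this parametrization, the function names, and the parameters `n_directions` and `seed` are illustrative assumptions, and any further normalization implied by the RBV unit ball is not reproduced here. Instead of the gradient-based deep learning optimization mentioned above, the sketch swaps in a plain random search over directions, with offsets taken from the projected pooled sample.

```python
import math
import random


def ridge_spline(x, w, b, k):
    """Assumed witness form (w.x - b)_+^k; for k = 0 this is the indicator of w.x > b."""
    t = sum(wi * xi for wi, xi in zip(w, x)) - b
    if t <= 0:
        return 0.0
    return float(t ** k)


def mean_difference(xs, ys, w, b, k):
    """Absolute difference of the sample means of the ridge spline on xs and ys."""
    px = sum(ridge_spline(x, w, b, k) for x in xs) / len(xs)
    qy = sum(ridge_spline(y, w, b, k) for y in ys) / len(ys)
    return abs(px - qy)


def random_unit_vector(d, rng):
    """Direction drawn uniformly on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def rks_statistic_random_search(xs, ys, k=0, n_directions=200, seed=0):
    """Approximate the maximum mean difference over single neurons.

    Random search is a stand-in for gradient-based optimization (an
    assumption of this sketch), so the result is a lower bound on the
    maximum over the parametrized family.
    """
    rng = random.Random(seed)
    d = len(xs[0])
    pooled = xs + ys
    best = 0.0
    for _ in range(n_directions):
        w = random_unit_vector(d, rng)
        # Candidate offsets: projections of the pooled sample onto w.
        for z in pooled:
            b = sum(wi * zi for wi, zi in zip(w, z))
            best = max(best, mean_difference(xs, ys, w, b, k))
    return best
```

In one dimension with `k=0`, the only unit directions are $\pm 1$ and the neuron is a threshold indicator, so this search returns the two-sample KS statistic; for example, the fully separated samples `[[0.0], [1.0], [2.0]]` and `[[3.0], [4.0], [5.0]]` give a value of 1.0. The cost grows as `n_directions` times the square of the pooled sample size, so this is meant for illustration on small samples only.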