The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution, which typically requires costly resampling or permutation approaches for calibration. In this work, we leverage a martingale interpretation of the estimated squared MMD to propose the martingale MMD (mMMD), a quadratic-time statistic that has a limiting standard Gaussian distribution under the null. Moreover, we show that the test is consistent against any fixed alternative. For large sample sizes, mMMD offers substantial computational savings over the standard MMD test, with only a minor loss in power.
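For context, the permutation-calibrated baseline that the abstract describes as costly can be sketched as follows. This is a minimal illustration of the standard unbiased squared-MMD estimate with a permutation p-value, not the mMMD statistic proposed here. The Gaussian kernel, the fixed bandwidth, the number of permutations, and all function names are assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(K, idx_x, idx_y):
    """Unbiased squared-MMD estimate from a precomputed pooled kernel matrix."""
    Kxx = K[np.ix_(idx_x, idx_x)]
    Kyy = K[np.ix_(idx_y, idx_y)]
    Kxy = K[np.ix_(idx_x, idx_y)]
    n, m = len(idx_x), len(idx_y)
    # Within-sample terms exclude the diagonal (i == j) entries.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * Kxy.mean()


def mmd_permutation_test(x, y, num_permutations=200, bandwidth=1.0, seed=0):
    """Return (observed statistic, permutation p-value) for samples x and y."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    n = len(x)
    K = gaussian_kernel(z, z, bandwidth)  # computed once: quadratic in sample size
    idx = np.arange(len(z))
    observed = mmd2_unbiased(K, idx[:n], idx[n:])
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(len(z))
        # Each permutation re-evaluates the quadratic-time statistic.
        if mmd2_unbiased(K, perm[:n], perm[n:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value
```

Each permutation repeats a quadratic-time computation, so the total cost grows with the number of permutations. A statistic with a known Gaussian limit under the null, as the abstract claims for mMMD, would instead be compared to a standard normal quantile after a single evaluation.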