Standard kernel two-sample tests, such as those based on the Maximum Mean Discrepancy (MMD), aggregate squared differences across all directions in a Reproducing Kernel Hilbert Space (RKHS). In finite samples, however, the components along trailing eigen-directions are noisy, which degrades test power. We propose a kernel-based test that addresses this by truncating the spectral decomposition of the MMD and retaining only the well-estimated leading eigen-directions. Aggregating these robust components yields higher power and greater robustness, particularly in high-dimensional and unbalanced settings. We also introduce a computationally efficient parametric bootstrap procedure for approximating critical values, which is theoretically justified and significantly faster than permutation-based alternatives. Extensive simulations and empirical studies show that our method maintains strict Type I error control while delivering higher power than existing MMD-based tests.
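To make the truncation idea concrete, here is a minimal sketch, not the authors' exact statistic. It rests on several assumptions: a Gaussian kernel with a fixed `bandwidth`, the biased (V-statistic) MMD estimate, an eigendecomposition of the centered pooled Gram matrix, and a user-chosen truncation level `r`. The names `gaussian_gram` and `truncated_mmd2` are hypothetical. The abstract does not specify how the proposed test weights or normalizes the retained components, and the parametric bootstrap for critical values is not shown.

```python
import numpy as np


def gaussian_gram(Z, bandwidth):
    """Gaussian-kernel Gram matrix of the rows of Z (kernel choice is an assumption)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def truncated_mmd2(X, Y, r, bandwidth=1.0):
    """Biased MMD^2 restricted to the r leading empirical eigen-directions.

    Illustrative only: the proposed test may weight or normalize
    the retained components differently.
    """
    n, m = len(X), len(Y)
    N = n + m
    K = gaussian_gram(np.vstack([X, Y]), bandwidth)

    # Center the pooled Gram matrix: Kc = H K H with H = I - 11^T / N.
    H = np.eye(N) - np.ones((N, N)) / N
    Kc = H @ K @ H

    # eigh returns eigenvalues in ascending order; keep the r largest.
    evals, evecs = np.linalg.eigh(Kc)
    top = np.argsort(evals)[::-1][:r]

    # Contrast vector: w @ K @ w is the biased MMD^2 estimate.
    w = np.concatenate([np.full(n, 1.0 / n), np.full(m, -1.0 / m)])

    # Because w sums to zero, w @ Kc @ w == w @ K @ w, so the full
    # spectral sum recovers the usual statistic; truncation drops the tail.
    proj = evecs[:, top].T @ w
    return float(np.sum(evals[top] * proj ** 2))


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
Y = rng.normal(loc=0.5, size=(20, 3))
print(truncated_mmd2(X, Y, r=5), truncated_mmd2(X, Y, r=50))
```

With `r` equal to the pooled sample size, the sum reproduces the ordinary biased MMD² estimate. Smaller `r` discards the contribution of the trailing eigen-directions, which the abstract identifies as the noisy ones.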