We use a suitable version of the so-called "kernel trick" to devise two-sample (homogeneity) tests, with a particular focus on high-dimensional and functional data. Our proposal simplifies the important practical problem of selecting an appropriate kernel function. Specifically, we apply a uniform variant of the kernel trick, which takes the supremum over a class of kernel-based distances. We obtain the asymptotic distribution of the test statistic under both the null and the alternative hypotheses. The proofs rely on empirical process theory, combined with the delta method and Hadamard (directional) differentiability techniques, and on functional Karhunen--Lo\`eve-type expansions of the underlying processes. This methodology has some advantages over other standard approaches in the literature. We also give some experimental insight into the performance of our proposal compared with the original kernel-based approach \cite{Gretton2007} and the test based on energy distances \cite{Szekely-Rizzo-2017}.
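To make the idea of a supremum over a class of kernel-based distances concrete, the following is a minimal sketch and not the authors' implementation. It assumes the kernel class is the Gaussian family indexed by a bandwidth, approximates the supremum by a maximum over a finite, hypothetical bandwidth grid, and uses the biased (V-statistic) estimate of the squared kernel distance. The test is calibrated here by permutation purely for illustration; the abstract instead derives the asymptotic distribution of the statistic. All function names and parameters are illustrative assumptions.

```python
import numpy as np


def gaussian_mmd2(x, y, bandwidth):
    """Biased (V-statistic) estimate of the squared kernel distance (MMD^2)
    between samples x and y, using a Gaussian kernel (assumed kernel family)."""
    z = np.vstack([x, y])
    sq_norms = np.sum(z**2, axis=1)
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T, 0.0)
    k = np.exp(-d2 / (2.0 * bandwidth**2))
    n = len(x)
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def sup_mmd2(x, y, bandwidths):
    """Approximate the supremum over the kernel class by a maximum over a
    finite bandwidth grid (the grid is an assumption of this sketch)."""
    return max(gaussian_mmd2(x, y, h) for h in bandwidths)


def permutation_pvalue(x, y, bandwidths, n_perm=200, seed=0):
    """Permutation calibration of the sup statistic (illustrative only;
    the paper works with the asymptotic distribution instead)."""
    rng = np.random.default_rng(seed)
    observed = sup_mmd2(x, y, bandwidths)
    z = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(z))
        if sup_mmd2(z[idx[:n]], z[idx[n:]], bandwidths) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(50, 5))
    y = rng.normal(2.0, 1.0, size=(50, 5))  # mean-shifted alternative
    stat, pval = permutation_pvalue(x, y, bandwidths=[0.5, 1.0, 2.0, 4.0])
    print(f"sup MMD^2 = {stat:.4f}, permutation p-value = {pval:.4f}")
```

Taking the maximum over the grid removes the need to commit to a single bandwidth in advance, which is the practical simplification the abstract refers to. A finite grid only bounds the true supremum from below, so a finer grid or a continuous optimiser may be preferable in practice.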