In this article, we propose a two-sample test for functional observations modeled as elements of a separable Hilbert space. We present a general recipe for constructing a measure of dissimilarity between the distributions of two Hilbertian random variables and study the theoretical properties of one such measure, constructed by applying the Maximum Mean Discrepancy (MMD) to random linear projections of the distributions and aggregating the results. We propose a data-driven estimate of this measure and use it as the test statistic. We derive the large-sample distributions of this statistic under both the null and the alternative hypotheses. The test statistic involves a kernel function and an associated bandwidth. We prove that the resulting test is consistent in large samples for any data-driven choice of bandwidth that converges in probability to a positive number. Since the theoretical quantiles of the limiting null distribution are intractable, the test is calibrated in practice using the permutation method. We also derive the limiting distribution of the permuted test statistic and the asymptotic power of the permutation test under local contiguous alternatives. This shows that the permutation test is consistent and statistically efficient in the Pitman sense. Extensive simulation studies are carried out and a real data set is analyzed to compare the performance of the proposed test with that of some state-of-the-art methods.
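To make the construction concrete, the following is a minimal sketch of a projection-averaged MMD statistic calibrated by permutation. It is not the estimator studied in the article: the function name `projected_mmd_test`, the Gaussian projection directions, the Gaussian kernel, the pooled median-heuristic bandwidth, the biased (V-statistic) form of the squared MMD, and the simple averaging over projections are all assumptions made for illustration. Curves are assumed to be observed on a common grid, so that inner products reduce to scaled dot products.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian kernel Gram matrix for a 1-D sample z."""
    d2 = (z[:, None] - z[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def mmd2_biased(K, n):
    """Biased (V-statistic) squared MMD from a pooled Gram matrix.

    The first n rows/columns of K belong to sample 1, the rest to sample 2.
    """
    return K[:n, :n].mean() + K[n:, n:].mean() - 2.0 * K[:n, n:].mean()


def projected_mmd_test(X, Y, n_proj=50, n_perm=200, seed=0):
    """Average squared MMD over random 1-D projections, with a permutation p-value.

    X has shape (n, p) and Y has shape (m, p): each row is one curve on a common grid.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = np.vstack([X, Y])
    N = Z.shape[0]

    # Assumed choice: i.i.d. standard Gaussian directions on the grid.
    U = rng.standard_normal((p, n_proj))
    # Riemann-sum approximation of the inner products <z, u>.
    P = Z @ U / p

    # The biased squared MMD is linear in the Gram matrix, so averaging the
    # statistic over projections equals the statistic of the averaged Gram matrix.
    K = np.zeros((N, N))
    upper = np.triu_indices(N, k=1)
    for j in range(n_proj):
        z = P[:, j]
        dists = np.abs(z[:, None] - z[None, :])
        # Assumed choice: median heuristic on the pooled sample, which does not
        # depend on the group labels and is therefore unchanged by permutation.
        bw = np.median(dists[upper])
        if bw <= 0:
            bw = 1.0
        K += gaussian_gram(z, bw) / n_proj

    stat = mmd2_biased(K, n)

    # Permutation calibration: relabel the pooled sample and recompute.
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(N)
        perm_stats[b] = mmd2_biased(K[np.ix_(idx, idx)], n)
    p_value = (1 + np.sum(perm_stats >= stat)) / (n_perm + 1)
    return stat, p_value


# Toy usage: white-noise curves on 30 grid points, second sample shifted in mean.
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 30))
Y = rng.standard_normal((40, 30)) + 2.0
stat, p_value = projected_mmd_test(X, Y)
print(stat, p_value)
```

Because the bandwidth is computed from the pooled sample, the averaged Gram matrix only needs to be built once and can be re-indexed for each permutation, which keeps the calibration step cheap in this sketch.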