This paper introduces and investigates the use of maximum and average distance correlations for multivariate independence testing. We characterize their consistency properties in high-dimensional settings with respect to the number of marginally dependent dimensions, assess the advantages of each test statistic, examine their respective null distributions, and present a fast chi-square-based testing procedure. The resulting tests are non-parametric and can use either the Euclidean distance or the Gaussian kernel as the underlying metric. To better understand the practical use cases of the proposed tests, we evaluate the empirical performance of the maximum distance correlation, the average distance correlation, and the original distance correlation across various multivariate dependence scenarios, and we conduct a real-data experiment testing for dependence between the presence of various cancer types and peptide levels in human plasma.
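As a rough illustration of the statistics named above, the following minimal sketch computes the biased sample distance correlation with the Euclidean distance, then takes the maximum and the average of the marginal distance correlations. It assumes that "marginal" means pairing each coordinate of X with the full Y, which this abstract does not spell out. The function names (`dcor`, `marginal_dcors`, `max_avg_dcor`, `permutation_pvalues`) are hypothetical, and the p-values come from a plain permutation test, not from the chi-square-based procedure mentioned in the abstract.

```python
import numpy as np


def _centered_dist(a):
    """Double-centered Euclidean distance matrix of the rows of `a`."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    d = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcor(x, y):
    """Biased sample distance correlation between x and y (value in [0, 1])."""
    A, B = _centered_dist(x), _centered_dist(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def marginal_dcors(x, y):
    """Distance correlation of each column of x with the full y (assumed definition)."""
    x = np.asarray(x, dtype=float)
    return np.array([dcor(x[:, j], y) for j in range(x.shape[1])])


def max_avg_dcor(x, y):
    """Maximum and average of the marginal distance correlations."""
    m = marginal_dcors(x, y)
    return m.max(), m.mean()


def permutation_pvalues(x, y, n_perm=200, seed=0):
    """Permutation p-values for (max, average); a stand-in for the chi-square test."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = np.array(max_avg_dcor(x, y))
    exceed = np.zeros(2)
    for _ in range(n_perm):
        perm = rng.permutation(len(y))  # shuffling y breaks any dependence on x
        exceed += np.array(max_avg_dcor(x, y[perm])) >= observed
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=(100, 10))
    y = x[:, 0] ** 2 + 0.5 * rng.normal(size=100)  # only the first dimension matters
    print("max, average:", max_avg_dcor(x, y))
    print("p-values (max, average):", permutation_pvalues(x, y, n_perm=100))
```

In this toy setup only one of the ten dimensions carries signal, so the maximum is driven by that single coordinate while the average is diluted by the nine noise coordinates. This is the kind of trade-off with respect to the number of marginally dependent dimensions that the abstract refers to.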