We propose new statistical tests, in high-dimensional settings, for the independence of two random vectors and for their conditional independence given a third random vector. The key idea is simple: we first transform each component variable to standard normal via its marginal empirical distribution, and we then test for independence and conditional independence of the transformed random vectors using appropriate $L_\infty$-type test statistics. Although the tests target only certain necessary conditions for independence or conditional independence, they outperform 13 frequently used testing methods in a large-scale simulation comparison. The advantages of the new tests can be summarized as follows: (i) they do not require any moment conditions, (ii) they allow arbitrary dependence structures among the components of the random vectors, and (iii) they allow the dimensions of the random vectors to diverge at exponential rates in the sample size. The critical values of the proposed tests are determined by a computationally efficient multiplier bootstrap procedure. Theoretical analysis shows that the sizes of the proposed tests are well controlled at the nominal significance level, and that the proposed tests are consistent under certain local alternatives. The finite-sample performance of the new tests is illustrated via extensive simulation studies and a real data application.
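The three steps described above (rank-based normal transformation, an $L_\infty$-type statistic, and multiplier-bootstrap critical values) can be sketched for the unconditional independence test as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function names `normal_scores` and `linf_independence_test`, the rank scaling `rank / (n + 1)`, the use of the maximum absolute scaled cross-product as the statistic, and the Gaussian multipliers are all choices made here for concreteness, and the exact statistic and bootstrap in the paper may differ. The conditional independence test is not covered.

```python
import math
import random
from statistics import NormalDist


def normal_scores(column):
    """Map one variable to approximate standard normal scores via its ranks.

    Assumes a continuous variable (no special handling of ties).
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the argument strictly inside (0, 1).
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def linf_independence_test(x, y, n_boot=500, alpha=0.05, seed=0):
    """Hypothetical L-infinity test; x and y are lists of n rows each."""
    rng = random.Random(seed)
    n = len(x)
    u_cols = [normal_scores([row[j] for row in x]) for j in range(len(x[0]))]
    v_cols = [normal_scores([row[k] for row in y]) for k in range(len(y[0]))]

    # One list of per-observation products U_ij * V_ik for each pair (j, k).
    products = [[u[i] * v[i] for i in range(n)] for u in u_cols for v in v_cols]
    means = [sum(p) / n for p in products]

    # Statistic: largest absolute scaled cross-product over all pairs.
    stat = math.sqrt(n) * max(abs(m) for m in means)

    # Multiplier bootstrap: reweight centered products with N(0, 1) draws.
    boot = []
    for _ in range(n_boot):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        boot.append(max(
            abs(sum(e[i] * (p[i] - m) for i in range(n))) / math.sqrt(n)
            for p, m in zip(products, means)
        ))
    boot.sort()
    critical = boot[min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)]
    p_value = sum(b >= stat for b in boot) / n_boot
    return stat, critical, p_value
```

Independence would be rejected at level `alpha` when `stat` exceeds `critical`. Because only the ranks of each component enter the statistic, the sketch behaves the same on heavy-tailed inputs, which is consistent with the claim that no moment conditions are needed.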