In this article, we propose a class of $L_q$-norm based U-statistics for a family of global testing problems related to high-dimensional data. This family includes testing the mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive the asymptotic normality of, and the asymptotic independence between, $L_q$-norm based U-statistics for several values of $q$ under mild moment and cumulant conditions. We propose a simple combination of two studentized $L_q$-based test statistics via their $p$-values and show that it attains high power against alternatives of different sparsity. Our work is a substantial extension of He et al. (2021), which focuses mostly on mean and covariance testing, and we provide a general treatment of the asymptotic independence of $L_q$-norm based U-statistics for a wide class of kernels. To alleviate the computational burden, we introduce a variant of the proposed U-statistics that sums only over monotone indices, resulting in a U-statistic with an asymmetric kernel. A dynamic programming method reduces the computational cost from $O(n^{qr})$, which is required to calculate the full U-statistic, to $O(n^r)$, where $r$ is the order of the kernel. Numerical studies further corroborate the advantage of the proposed adaptive test over some existing competitors.
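To make the monotone-index idea concrete, the following is a minimal sketch for the simplest case only: mean testing with an order-one kernel $h(x) = x$ (so $r = 1$). Here the sum over $i_1 < \dots < i_q$ of $\prod_k x_{i_k}$ can be accumulated in one pass over the sample rather than by enumerating all $q$-tuples. The function names, the normalisation by $\binom{n}{q}$, and the min-$p$ combination rule (which relies on the asymptotic independence of the statistics) are assumptions made for illustration; the abstract does not specify them, and this is not the general-kernel algorithm of the paper.

```python
import math
import numpy as np


def monotone_sum_dp(x, q):
    """Sum over i_1 < ... < i_q of x[i_1] * ... * x[i_q] in O(n * q) time."""
    # d[c] holds the sum of products over all increasing c-tuples seen so far.
    d = np.zeros(q + 1)
    d[0] = 1.0
    for value in x:
        # Update from high to low so each observation enters a tuple at most once.
        for c in range(q, 0, -1):
            d[c] += value * d[c - 1]
    return float(d[q])


def lq_mean_statistic(data, q):
    """Hypothetical L_q-type statistic for the mean: coordinate-wise monotone
    sums, added over the p coordinates and averaged over the C(n, q) tuples."""
    n, p = data.shape
    total = sum(monotone_sum_dp(data[:, j], q) for j in range(p))
    return total / math.comb(n, q)


def min_p_combination(p_values):
    """Assumed combination rule: min-p, valid when the p-values are independent."""
    k = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** k
```

For example, `lq_mean_statistic(data, 2)` and `lq_mean_statistic(data, 6)` could be studentized separately, converted to $p$-values, and passed to `min_p_combination`; a small $q$ targets dense alternatives and a large $q$ targets sparse ones.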