In this paper, under the assumption that the dimension is much larger than the sample size, i.e., $p \asymp n^{\alpha}$ with $\alpha>1$, we consider the (unnormalized) sample covariance matrices $Q = \Sigma^{1/2} XX^*\Sigma^{1/2}$, where $X=(x_{ij})$ is a $p \times n$ random matrix with centered i.i.d. entries of variance $(pn)^{-1/2}$, and $\Sigma$ is the deterministic population covariance matrix. We establish two classes of central limit theorems (CLTs) for the linear spectral statistics (LSS) of $Q$: the global CLTs on the macroscopic scales and the local CLTs on the mesoscopic scales. We prove that the LSS converge to Gaussian processes whose mean and covariance functions, which depend on $\Sigma$, the ratio $p/n$, and the test functions, can be identified explicitly on both macroscopic and mesoscopic scales. We also show that, although the global CLTs depend on the fourth cumulant of $x_{ij}$, the local CLTs do not. Based on these results, we propose two classes of statistics for testing the structure of $\Sigma$, the global statistics and the local statistics, and analyze their superior power under general local alternatives. To the best of our knowledge, this is the first time that local LSS test statistics, which do not rely on the fourth moment of $x_{ij}$, are used in hypothesis testing; the literature mostly uses global statistics and requires prior knowledge of the fourth cumulant. Numerical simulations confirm the accuracy and power of our proposed statistics and illustrate better performance than the existing methods in the literature.
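As a minimal numerical sketch of the model described in the abstract, the following Python snippet draws $X$ with entry variance $(pn)^{-1/2}$ and evaluates a linear spectral statistic $\sum_i f(\lambda_i)$ over the nonzero eigenvalues of $Q$. It is an illustration only, not the paper's procedure: the Gaussian entries, the diagonal $\Sigma$, and the function names `sample_X`, `nonzero_eigenvalues`, and `linear_spectral_statistic` are all assumptions introduced here.

```python
import numpy as np


def sample_X(p, n, rng):
    """Draw a p x n matrix with centered i.i.d. entries of variance (p*n)^(-1/2).

    Gaussian entries are an assumption made for this sketch; the abstract only
    requires centered i.i.d. entries with this variance.
    """
    return rng.standard_normal((p, n)) * (p * n) ** (-0.25)


def nonzero_eigenvalues(X, sigma_diag):
    """Eigenvalues of the n x n matrix X^T Sigma X for a diagonal Sigma.

    Q = Sigma^{1/2} X X^T Sigma^{1/2} is p x p with rank at most n, and its
    nonzero eigenvalues coincide with those of X^T Sigma X, which is much
    cheaper to diagonalize when p >> n. A diagonal Sigma (passed as a vector)
    is an assumption of this sketch.
    """
    G = X.T @ (sigma_diag[:, None] * X)
    return np.linalg.eigvalsh(G)


def linear_spectral_statistic(eigvals, f):
    """Unnormalized LSS: the sum of f over the supplied eigenvalues."""
    return float(np.sum(f(eigvals)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    p = int(n ** 1.5)          # p grows like n^alpha with alpha = 1.5 > 1
    X = sample_X(p, n, rng)
    eigvals = nonzero_eigenvalues(X, np.ones(p))   # Sigma = I
    print(linear_spectral_statistic(eigvals, np.log))
```

Working with the $n \times n$ companion matrix rather than the $p \times p$ matrix $Q$ is a convenience of this sketch: it skips the $p-n$ zero eigenvalues of $Q$, so a test function such as the logarithm is only applied to the nonzero spectrum.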