Let the data dimension $N$ and the sample size $T$ tend to $\infty$ with $N/T \to c > 0$. The spectral properties of a sample correlation matrix $\mathbf{C}$ and a sample covariance matrix $\mathbf{S}$ are asymptotically equal whenever the population correlation matrix $\mathbf{R}$ is bounded (El Karoui 2009). We show that this also holds for general linear models with unbounded $\mathbf{R}$, by examining the behavior of the singular values of multiplicatively perturbed matrices. As a consequence, we establish the following. Consider a factor model with idiosyncratic noise variance $\sigma^2$ and a rank-$r$ factor loading matrix $\mathbf{L}$ whose rows all have a common Euclidean norm $L$. Then the $k$th largest eigenvalue $\lambda_k$ $(1\le k\le N)$ of $\mathbf{C}$ satisfies, almost surely: (1) $\lambda_r$ diverges; (2) $\lambda_k/s_k^2\to1/(L^2 + \sigma^2)$ $(1 \le k \le r)$, where $s_k$ is the $k$th largest singular value of $\mathbf{L}$; and (3) $\lambda_{r + 1}\to(1-\rho)(1+\sqrt{c})^2$, where $\rho := L^2/(L^2 + \sigma^2)$. Whenever $s_r$ is much larger than $\sqrt{\log N}$, the broken-stick rule (Frontier 1976, Jackson 1993), which estimates $\mathrm{rank}\, \mathbf{L}$ by means of a random partition (Holst 1980) of $[0,\,1]$, tends to $r$ (a.s.). We also provide a natural factor model in which the rule tends (a.s.) to the "essential rank" of $\mathbf{L}$, which is smaller than $\mathrm{rank}\, \mathbf{L}$.
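The statements above can be explored numerically. The following is a minimal sketch, not the authors' procedure. It assumes Gaussian factors and noise, loading rows drawn with uniformly random directions and rescaled to norm $L$, and one common variant of the broken-stick rule: count the leading eigenvalues whose share of the trace exceeds the expected length $b_k = \frac{1}{N}\sum_{i=k}^{N} 1/i$ of the $k$th longest piece of a randomly broken unit stick. The function names and default parameters are hypothetical.

```python
import numpy as np


def broken_stick_thresholds(n):
    """Expected lengths, in decreasing order, of n pieces of a randomly broken unit stick."""
    inv = 1.0 / np.arange(1, n + 1)
    # b_k = (1/n) * sum_{i=k}^{n} 1/i, computed as a reversed cumulative sum
    return np.cumsum(inv[::-1])[::-1] / n


def broken_stick_rank(eigvals):
    """Count the leading eigenvalues whose share of the trace exceeds b_k (assumed variant)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    share = lam / lam.sum()
    exceeds = share > broken_stick_thresholds(lam.size)
    if exceeds.all():
        return int(lam.size)
    # index of the first False equals the number of leading True entries
    return int(np.argmin(exceeds))


def simulate_correlation_eigenvalues(N=200, T=400, r=2, L=2.0, sigma=1.0, seed=0):
    """Simulate X = load @ factors + noise and return the eigenvalues of C (descending)
    together with the singular values of the loading matrix."""
    rng = np.random.default_rng(seed)
    # Assumption: row directions are uniform on the sphere; every row has norm L.
    G = rng.standard_normal((N, r))
    load = L * G / np.linalg.norm(G, axis=1, keepdims=True)
    factors = rng.standard_normal((r, T))        # assumed standard normal factors
    noise = sigma * rng.standard_normal((N, T))  # idiosyncratic noise, variance sigma^2
    X = load @ factors + noise
    C = np.corrcoef(X)                           # rows of X are the N variables
    lam = np.linalg.eigvalsh(C)[::-1]            # eigvalsh returns ascending order
    s = np.linalg.svd(load, compute_uv=False)
    return lam, s


if __name__ == "__main__":
    N, T, r, L, sigma = 200, 400, 2, 2.0, 1.0
    lam, s = simulate_correlation_eigenvalues(N, T, r, L, sigma)
    rho = L**2 / (L**2 + sigma**2)
    print("lambda_k / s_k^2 :", lam[:r] / s[:r] ** 2)
    print("1/(L^2+sigma^2)  :", 1.0 / (L**2 + sigma**2))
    print("lambda_{r+1}     :", lam[r])
    print("(1-rho)(1+sqrt(c))^2 :", (1 - rho) * (1 + np.sqrt(N / T)) ** 2)
    print("broken-stick rank estimate:", broken_stick_rank(lam))
```

Running the script prints the finite-sample ratios next to the limits stated in (2) and (3), so the quality of the asymptotic approximation can be judged for a chosen $N$, $T$, and $c = N/T$.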