Consider a data matrix $Y = [\mathbf{y}_1, \cdots, \mathbf{y}_N]$ of size $M \times N$ whose columns are independent observations of a random vector $\mathbf{y}$ with zero mean and population covariance $\Sigma$. Let $\mathbf{u}_i$ and $\mathbf{v}_j$ denote the left and right singular vectors of $Y$, respectively. We study the eigenvector/singular vector overlaps $\langle \mathbf{u}_i, D_1 \mathbf{u}_j \rangle$, $\langle \mathbf{v}_i, D_2 \mathbf{v}_j \rangle$ and $\langle \mathbf{u}_i, D_3 \mathbf{v}_j \rangle$, where the $D_k$ are general deterministic matrices with bounded operator norms. In the regime where the dimension $M$ grows proportionally to the sample size $N$, we establish convergence in probability of these overlaps to their deterministic counterparts, with explicit convergence rates. Building on these results, we give a more precise characterization of the loss of Ledoit and Wolf's nonlinear shrinkage estimators of the population covariance $\Sigma$.
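To make the three overlap quantities concrete, the following is a minimal numerical sketch, not the method of the paper. It assumes Gaussian data with a diagonal $\Sigma$, and the function name `singular_vector_overlaps` as well as the particular choices of $D_1$, $D_2$, $D_3$ are hypothetical, picked only for illustration. Note that $D_1$ must be $M \times M$, $D_2$ must be $N \times N$, and $D_3$ must be $M \times N$ for the inner products to be defined.

```python
import numpy as np


def singular_vector_overlaps(Y, D1, D2, D3):
    """Return the matrices of overlaps <u_i, D1 u_j>, <v_i, D2 v_j>, <u_i, D3 v_j>.

    Y is M x N; D1 is M x M, D2 is N x N, D3 is M x N.
    Only the first min(M, N) singular vectors are used (thin SVD).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt.T
    uu = U.T @ D1 @ U  # entry (i, j) is <u_i, D1 u_j>
    vv = V.T @ D2 @ V  # entry (i, j) is <v_i, D2 v_j>
    uv = U.T @ D3 @ V  # entry (i, j) is <u_i, D3 v_j>
    return uu, vv, uv, s


# Illustrative setup (assumptions): Gaussian columns, diagonal Sigma, M/N = 1/2.
rng = np.random.default_rng(0)
M, N = 50, 100
sigma_diag = np.linspace(1.0, 3.0, M)  # eigenvalues of the assumed Sigma
Y = np.sqrt(sigma_diag)[:, None] * rng.standard_normal((M, N))

D1 = np.diag(sigma_diag)  # D1 = Sigma
D2 = np.eye(N)            # trivial choice: overlaps reduce to orthonormality
D3 = np.eye(M, N)         # a simple rectangular test matrix

uu, vv, uv, s = singular_vector_overlaps(Y, D1, D2, D3)

# With D1 = Sigma, the diagonal entries are the quadratic forms <u_i, Sigma u_i>.
quadratic_forms = np.diag(uu)
```

With $D_1 = \Sigma$, the diagonal entries $\langle \mathbf{u}_i, \Sigma \mathbf{u}_i \rangle$ are the quantities that nonlinear shrinkage aims to estimate, which is one way to see why control of these overlaps bears on the loss of such estimators.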