Consider a data matrix $Y = [\mathbf{y}_1, \cdots, \mathbf{y}_N]$ of size $M \times N$, where the columns are independent observations from a random vector $\mathbf{y}$ with zero mean and population covariance $\Sigma$. Let $\mathbf{u}_i$ and $\mathbf{v}_j$ denote the left and right singular vectors of $Y$, respectively. This study investigates the eigenvector/singular vector overlaps $\langle {\mathbf{u}_i, D_1 \mathbf{u}_j} \rangle$, $\langle {\mathbf{v}_i, D_2 \mathbf{v}_j} \rangle$ and $\langle {\mathbf{u}_i, D_3 \mathbf{v}_j} \rangle$, where $D_k$ are general deterministic matrices with bounded operator norms. We establish the convergence in probability of these eigenvector overlaps toward their deterministic counterparts with explicit convergence rates, when the dimension $M$ scales proportionally with the sample size $N$. Building on these findings, we offer a more precise characterization of the loss for Ledoit and Wolf's nonlinear shrinkage estimators of the population covariance $\Sigma$.
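To make the three overlap quantities concrete, the following is a minimal numerical sketch, not the method of the paper. It assumes Gaussian observations, a diagonal $\Sigma$, and illustrative choices $D_1 = \Sigma$, $D_2 = I_N$, and $D_3$ the $M \times N$ rectangular identity. The function name `singular_vector_overlaps` and all parameter values are hypothetical.

```python
import numpy as np


def singular_vector_overlaps(Y, D1, D2, D3):
    """Return overlap matrices whose (i, j) entries are
    <u_i, D1 u_j>, <v_i, D2 v_j>, and <u_i, D3 v_j>,
    together with the singular values of Y."""
    # Thin SVD: columns of U are left singular vectors,
    # rows of Vt are right singular vectors.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt.T
    uu = U.T @ D1 @ U  # D1 must be M x M
    vv = V.T @ D2 @ V  # D2 must be N x N
    uv = U.T @ D3 @ V  # D3 must be M x N
    return uu, vv, uv, s


# Assumed setup: M and N of comparable size, zero-mean Gaussian columns.
rng = np.random.default_rng(0)
M, N = 50, 100

# Hypothetical diagonal population covariance Sigma.
sigma_diag = np.linspace(1.0, 3.0, M)
Y = np.sqrt(sigma_diag)[:, None] * rng.standard_normal((M, N))

# Illustrative deterministic test matrices with bounded operator norm.
D1 = np.diag(sigma_diag)  # D1 = Sigma
D2 = np.eye(N)            # identity, so <v_i, D2 v_j> = delta_ij
D3 = np.eye(M, N)         # rectangular identity

uu, vv, uv, s = singular_vector_overlaps(Y, D1, D2, D3)
```

With $D_1 = \Sigma$, the diagonal entries `uu[i, i]` are the quantities $\langle \mathbf{u}_i, \Sigma \mathbf{u}_i \rangle$ that nonlinear shrinkage estimators aim to approximate, which is why this choice is used in the sketch.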