We study properties of the sample covariance estimate $\widehat \Sigma = (\mathbf{X}_1 \mathbf{X}_1^\top + \ldots + \mathbf{X}_n \mathbf{X}_n^\top) / n$, where $\mathbf{X}_1, \dots, \mathbf{X}_n$ are i.i.d. random elements in $\mathbb R^d$ with $\mathbb E \mathbf{X}_1 = \mathbf{0}$ and $\mathbb E \mathbf{X}_1 \mathbf{X}_1^\top = \Sigma$. We derive dimension-free bounds on the squared Frobenius norm of $(\widehat\Sigma - \Sigma)$ under reasonable assumptions. For instance, we show that $| \|\widehat\Sigma - \Sigma\|_{\rm F}^2 - \mathbb E \|\widehat\Sigma - \Sigma\|_{\rm F}^2| = \mathcal O({\rm Tr}(\Sigma^2) / n)$ with overwhelming probability, which is a significant improvement over the existing results. This leads to a bound on the ratio $\|\widehat\Sigma - \Sigma\|_{\rm F}^2 / \mathbb E \|\widehat\Sigma - \Sigma\|_{\rm F}^2$ with a sharp leading constant when both the effective rank $\mathtt{r}(\Sigma) = {\rm Tr}(\Sigma) / \|\Sigma\|$ and $n / \mathtt{r}(\Sigma)^6$ tend to infinity: $\|\widehat\Sigma - \Sigma\|_{\rm F}^2 / \mathbb E \|\widehat\Sigma - \Sigma\|_{\rm F}^2 = 1 + \mathcal O(1 / \mathtt{r}(\Sigma))$.
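The step from the deviation bound to the ratio statement can be sketched heuristically as follows. This is an illustration under stated assumptions rather than the argument of the paper: it assumes $\mathbb E \|\mathbf{X}_1\|^4 < \infty$ and $\mathtt{r}(\Sigma) > 1$, takes $\|\Sigma\|$ to be the operator norm, and writes $C$ for a hypothetical absolute constant standing in for the one hidden in $\mathcal O(\cdot)$.

```latex
\begin{align*}
% Cross terms vanish because the summands X_i X_i^T - Sigma are i.i.d. and centered.
\mathbb E \|\widehat\Sigma - \Sigma\|_{\rm F}^2
  &= \frac{1}{n}\, \mathbb E \|\mathbf{X}_1 \mathbf{X}_1^\top - \Sigma\|_{\rm F}^2
   = \frac{\mathbb E \|\mathbf{X}_1\|^4 - {\rm Tr}(\Sigma^2)}{n} \\
% Jensen: E||X||^4 >= (E||X||^2)^2 = Tr(Sigma)^2.
  &\geq \frac{{\rm Tr}(\Sigma)^2 - {\rm Tr}(\Sigma^2)}{n}, \\[4pt]
% Sigma is positive semidefinite, so Tr(Sigma^2) <= ||Sigma|| Tr(Sigma).
{\rm Tr}(\Sigma^2)
  &\leq \|\Sigma\| \, {\rm Tr}(\Sigma)
   = \frac{{\rm Tr}(\Sigma)^2}{\mathtt{r}(\Sigma)}, \\[4pt]
% Divide the deviation bound C Tr(Sigma^2)/n by the lower bound on the expectation.
\left| \frac{\|\widehat\Sigma - \Sigma\|_{\rm F}^2}{\mathbb E \|\widehat\Sigma - \Sigma\|_{\rm F}^2} - 1 \right|
  &\leq \frac{C \, {\rm Tr}(\Sigma^2)}{{\rm Tr}(\Sigma)^2 - {\rm Tr}(\Sigma^2)}
   \leq \frac{C}{\mathtt{r}(\Sigma) - 1}
   = \mathcal O\!\left(\frac{1}{\mathtt{r}(\Sigma)}\right).
\end{align*}
```

The last inequality uses that $t \mapsto t / ({\rm Tr}(\Sigma)^2 - t)$ is increasing for $0 \leq t < {\rm Tr}(\Sigma)^2$, so the upper bound on ${\rm Tr}(\Sigma^2)$ can be substituted directly. The sketch does not account for the condition on $n / \mathtt{r}(\Sigma)^6$.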