We present an estimator of the covariance matrix $\Sigma$ of a random $d$-dimensional vector from an i.i.d. sample of size $n$. Our sole assumption is that this vector satisfies a bounded $L^p$-$L^2$ moment condition over its one-dimensional marginals, for some $p\geq 4$. Given this, we show that $\Sigma$ can be estimated from the sample with the same high-probability error rates that the sample covariance matrix achieves in the case of Gaussian data. This holds even though we allow for very general distributions that may not have moments of order $>p$. Moreover, our estimator can be made to be optimally robust to adversarial contamination. This result improves on the recent contributions by Mendelson and Zhivotovskiy and by Catoni and Giulini, and matches parallel work by Abdalla and Zhivotovskiy (the exact relationship with this last work is described in the paper).
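
For orientation, the moment assumption and the Gaussian benchmark rate can be sketched as follows. This is one common formalization, not necessarily the exact statement used in the paper: the symbols $X$ (the random vector), $\kappa$ (the moment constant), $\widehat{\Sigma}_n$ (the sample covariance), $\delta$ (the failure probability), $C$ (an absolute constant) and $r(\Sigma)$ (the effective rank) are notation assumed here for illustration.

```latex
% Assumed form of the L^p-L^2 moment condition on one-dimensional marginals:
% for some constant \kappa \ge 1 and every direction v \in \mathbb{R}^d,
\left( \mathbb{E}\,\bigl|\langle v, X - \mathbb{E}X \rangle\bigr|^{p} \right)^{1/p}
  \;\le\; \kappa \,
\left( \mathbb{E}\,\langle v, X - \mathbb{E}X \rangle^{2} \right)^{1/2}
  \;=\; \kappa \,\sqrt{v^{\top} \Sigma\, v}.

% Benchmark for Gaussian data (operator norm, probability at least 1 - \delta,
% in the regime n \gtrsim r(\Sigma) + \log(1/\delta)):
\bigl\| \widehat{\Sigma}_n - \Sigma \bigr\|_{\mathrm{op}}
  \;\le\; C \,\|\Sigma\|_{\mathrm{op}}
  \left( \sqrt{\frac{r(\Sigma)}{n}} + \sqrt{\frac{\log(1/\delta)}{n}} \right),
\qquad
r(\Sigma) \;=\; \frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|_{\mathrm{op}}}.
```

Under this reading, "the same error rates" means a bound of the second form for the proposed estimator, with $C$ allowed to depend on $\kappa$ and $p$; the precise dependence and the contamination term are given in the paper itself.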