We study operator-norm covariance estimation from heavy-tailed samples that may include a small fraction of arbitrary outliers. A simple and widely used safeguard is \emph{Euclidean norm clipping}, but its accuracy depends critically on an unknown clipping level. We propose a cross-fitted clipped covariance estimator equipped with \emph{fully computable} Bernstein-type deviation certificates, enabling principled data-driven tuning via a selector (\emph{MinUpper}) that balances certified stochastic error and a robust hold-out proxy for clipping bias. The resulting procedure adapts to intrinsic complexity measures such as effective rank under mild tail regularity and retains meaningful guarantees under only finite fourth moments. Experiments on contaminated spiked-covariance benchmarks illustrate stable performance and competitive accuracy across regimes.
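For concreteness, one common form of Euclidean norm clipping can be written as follows. This is an illustrative sketch under assumed notation (centered samples $X_1,\dots,X_n \in \mathbb{R}^d$, clipping level $\tau > 0$, clipping map $\psi_\tau$), not necessarily the exact cross-fitted estimator studied here:
\[
\psi_\tau(x) = \min\Bigl(1, \frac{\tau}{\|x\|_2}\Bigr)\, x \quad (\psi_\tau(0) = 0),
\qquad
\widehat{\Sigma}_\tau = \frac{1}{n} \sum_{i=1}^{n} \psi_\tau(X_i)\, \psi_\tau(X_i)^\top .
\]
Each summand has operator norm at most $\tau^2$, so a small $\tau$ limits the influence of any single sample, while $\mathbb{E}\bigl[\psi_\tau(X)\psi_\tau(X)^\top\bigr] \preceq \mathbb{E}\bigl[XX^\top\bigr]$ shows that clipping biases the estimate downward. A large $\tau$ shrinks this bias but restores sensitivity to heavy tails and outliers, which is why the choice of clipping level matters.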