Symmetry-Aware Convex Shrinkage for High-Dimensional Covariance Estimation

We develop a class of data-adaptive shrinkage estimators for high-dimensional covariance estimation in which the shrinkage target is the Reynolds projection of the sample covariance under a finite symmetry group, with the group selected from a candidate library by held-out predictive performance. The class generalizes the convex shrinkage estimator of Ledoit and Wolf by replacing the scaled-identity target with a structured target derived from a symmetry group when one is available. It also generalizes the group-symmetric maximum-likelihood estimator of Shah and Chandrasekaran by combining structural targeting with adaptive convex shrinkage and by selecting the group from data rather than treating it as prespecified. A two-tier procedure performs the group selection: a universal per-candidate evaluation based on held-out negative log-likelihood, optionally preceded by a domain-specific step that constructs the candidate library from structural priors. We establish a finite-sample regret bound for the held-out calibration of the convex combination weight, an oracle inequality for the data-driven group selection, and a quantitative sufficient-match condition under which the proposed estimator dominates Ledoit-Wolf shrinkage in Frobenius mean-squared error. The procedure is illustrated on six real-data problems spanning finance (S&P 500 daily returns), climate (NOAA OISST sea-surface temperature anomalies), genomics (TCGA-BRCA gene expression), radio signal processing (RadioML 2018.A), astronomical imaging (Galaxy10 DECaLS), and natural image patches (CIFAR-10 with a CIFAR-10.1 distribution-shift companion). We also compare the procedure empirically against the Bayesian permutation-symmetry estimator of Chojecki and colleagues. Structural priors carry the most information per observation in the few-shot regime; outside that regime, Ledoit-Wolf shrinkage remains the appropriate baseline.
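
As a minimal illustration of the construction described above, the following NumPy sketch forms the Reynolds projection of a sample covariance under a permutation group, shrinks convexly toward it, and picks the group and weight by held-out Gaussian negative log-likelihood. It is a sketch under stated assumptions, not the procedure analyzed in the paper: the data are taken to be zero-mean, each candidate group is supplied as a complete list of permutation index arrays, the weight is chosen by a simple grid search, and all function names (`cyclic_group`, `reynolds_projection`, `heldout_nll`, `select_group_and_weight`) are hypothetical.

```python
import numpy as np


def cyclic_group(p):
    """All p cyclic shifts of the coordinates, as index arrays (illustrative candidate)."""
    return [np.roll(np.arange(p), k) for k in range(p)]


def reynolds_projection(S, perms):
    """Average P S P^T over every element of a finite permutation group.

    `perms` must list the whole group; otherwise the result is not a projection.
    """
    return np.mean([S[np.ix_(g, g)] for g in perms], axis=0)


def heldout_nll(Sigma, X_val, rel_tol=1e-10):
    """Mean zero-mean Gaussian negative log-likelihood of the rows of X_val.

    Additive constants are dropped. Returns inf if Sigma is numerically singular.
    """
    w, V = np.linalg.eigh(Sigma)          # eigenvalues in ascending order
    if w[0] <= rel_tol * w[-1]:
        return np.inf
    Z = X_val @ V                          # coordinates in the eigenbasis
    quad = np.mean(np.sum(Z**2 / w, axis=1))
    return 0.5 * (np.sum(np.log(w)) + quad)


def select_group_and_weight(X_train, X_val, library, alphas):
    """Grid search over candidate groups and convex weights (an assumed scheme)."""
    S = X_train.T @ X_train / X_train.shape[0]   # zero-mean sample covariance
    best = {"score": np.inf, "group": None, "alpha": None, "Sigma": None}
    for name, perms in library.items():
        target = reynolds_projection(S, perms)
        for alpha in alphas:
            Sigma = (1.0 - alpha) * S + alpha * target
            score = heldout_nll(Sigma, X_val)
            if score < best["score"]:
                best = {"score": score, "group": name,
                        "alpha": float(alpha), "Sigma": Sigma}
    return best
```

A candidate library might be passed as `{"trivial": [np.arange(p)], "cyclic": cyclic_group(p)}` with `alphas = np.linspace(0.0, 1.0, 11)`. Under the trivial group the target equals the sample covariance itself, so that candidate reduces to no shrinkage at all.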
