We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max characterization of key statistics, enabling approximation of the mean $\mu_{\hat\theta}$ and covariance $C_{\hat\theta}$ of the ERM estimator $\hat\theta$. Specifically, under a concentration assumption on the data matrix and standard regularity conditions on the loss and regularizer, we show that for a test covariate $x$ independent of the training data, the projection $\hat\theta^\top x$ approximately follows the convolution of the generally non-Gaussian distribution of $\mu_{\hat\theta}^\top x$ with an independent centered Gaussian variable of variance $\mathrm{tr}(C_{\hat\theta} \mathbb{E}[xx^\top])$. This result clarifies the scope and limits of Gaussian universality for ERMs. Additionally, we prove that any $\mathcal{C}^2$ regularizer is asymptotically equivalent to a quadratic form determined solely by its Hessian at zero and gradient at $\mu_{\hat\theta}$. Numerical simulations across diverse losses and models are provided to validate our theoretical predictions and qualitative insights.
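As a reading aid, the distributional statement above can be sketched numerically: draw a test covariate $x$, form the mean part $\mu^\top x$, and add an independent centered Gaussian with variance $\mathrm{tr}(C\,\mathbb{E}[xx^\top])$. The sketch below only samples from this approximating law; it does not solve any ERM problem or implement the min-max characterization. The helper names (`trace_product`, `sample_projection`, `draw_rademacher`), the Rademacher covariate model, and the numerical values of `mu` and `cov` are all assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def trace_product(a, b):
    """Return tr(a @ b) for two square matrices given as lists of rows."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def sample_projection(mu, cov, second_moment, draw_x, n_samples, rng):
    """Sample mu^T x + sigma * g with sigma^2 = tr(cov @ second_moment).

    g is a standard Gaussian drawn independently of x.
    """
    sigma = math.sqrt(trace_product(cov, second_moment))
    samples = []
    for _ in range(n_samples):
        x = draw_x(rng)
        mean_part = sum(m * xi for m, xi in zip(mu, x))
        samples.append(mean_part + sigma * rng.gauss(0.0, 1.0))
    return samples


def draw_rademacher(rng, dim=3):
    """Hypothetical non-Gaussian covariate: i.i.d. +/-1 entries, E[x x^T] = I."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


# Illustrative values only (assumptions, not taken from the paper).
mu = [1.0, 0.0, 0.0]
cov = [[0.01 if i == j else 0.0 for j in range(3)] for i in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

rng = random.Random(0)
samples = sample_projection(mu, cov, identity, draw_rademacher, 20000, rng)
second_moment_hat = sum(s * s for s in samples) / len(samples)
print(f"empirical second moment: {second_moment_hat:.3f}")
```

In this toy setting $\mu^\top x$ takes only the values $\pm 1$, so the sampled projection is a two-component Gaussian mixture rather than a single Gaussian, which is the kind of departure from Gaussian universality the statement allows for. Its second moment should be close to $\mathrm{Var}(\mu^\top x) + \mathrm{tr}(C\,\mathbb{E}[xx^\top]) = 1 + 0.03$.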