Modern deep learning has revealed a surprising statistical phenomenon known as benign overfitting, with high-dimensional linear regression being a prominent example. This paper contributes to ongoing research on the ordinary least squares (OLS) interpolator, focusing on the partial regression setting, in which only a subset of the coefficients is implicitly regularized. On the algebraic front, we extend Cochran's formula and the leave-one-out residual formula to the partial regularization framework. On the stochastic front, we leverage our algebraic results to design several homoskedastic variance estimators under the Gauss-Markov model. These estimators serve as a basis for conducting statistical inference, albeit with slight conservatism in their performance. Through simulations, we study the finite-sample properties of these variance estimators across a range of generative models.