In the linear regression model, the minimum $\ell_2$-norm interpolant estimator has received much attention since it was proved to be consistent, even though it fits noisy data perfectly, under some condition on the covariance matrix $\Sigma$ of the input vector; this phenomenon is known as benign overfitting. Motivated by this phenomenon, we study the generalization property of this estimator from a geometrical viewpoint. Our main results extend and improve the convergence rates as well as the deviation probability from [Tsigler and Bartlett]. Our proof differs from the classical bias/variance analysis and is based on the self-induced regularization property introduced in [Bartlett, Montanari and Rakhlin]: the minimum $\ell_2$-norm interpolant estimator can be written as a sum of a ridge estimator and an overfitting component. The two geometrical properties of random Gaussian matrices at the heart of our analysis are the Dvoretzky-Milman theorem and the isomorphic and restricted isomorphic properties. In particular, the Dvoretzky dimension, which appears naturally in our geometrical viewpoint, coincides with the effective rank and is the key tool for handling the behavior of the design matrix restricted to the subspace where overfitting happens. We extend these results to heavy-tailed scenarios, proving the universality of this phenomenon beyond exponential moment assumptions. This phenomenon was previously unknown, and establishing it was widely believed to be a significant challenge. It follows from an anisotropic version of the probabilistic Dvoretzky-Milman theorem that holds for heavy-tailed vectors and is of independent interest.
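The decomposition of the minimum $\ell_2$-norm interpolant into a ridge-type part and an overfitting part can be checked numerically. The following is a minimal sketch, not the paper's construction: the sizes `n`, `p`, `k`, the spectrum `eigvals`, and the signal `beta_star` are hypothetical values chosen only for illustration, and the covariance is assumed diagonal so that the split is a split of coordinates. The first `k` coordinates of the interpolant coincide exactly with a generalized ridge estimator whose regularizer is the Gram matrix of the remaining features. Replacing that Gram matrix by `lam * I`, with `lam` the trace of the tail covariance, is the heuristic approximation one would expect from a Dvoretzky-Milman type concentration; the code prints how close it is rather than asserting it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and spectrum (illustration only, not from the paper).
n, p, k = 50, 2000, 5
eigvals = np.concatenate([np.full(k, 100.0), np.full(p - k, 0.05)])

# Gaussian design with covariance diag(eigvals) and noisy linear responses.
X = rng.standard_normal((n, p)) * np.sqrt(eigvals)
beta_star = np.zeros(p)
beta_star[:k] = 1.0
y = X @ beta_star + rng.standard_normal(n)

# Minimum l2-norm interpolant: beta_hat = X^T (X X^T)^{-1} y.
beta_hat = np.linalg.pinv(X) @ y

# Split the features into the first k coordinates and the remaining p - k.
X1, X2 = X[:, :k], X[:, k:]
gram_tail = X2 @ X2.T  # plays the role of the regularizer

# Exact block form: both parts share the same dual vector alpha.
alpha = np.linalg.solve(X1 @ X1.T + gram_tail, y)
ridge_part = X1.T @ alpha    # generalized ridge estimator on the first k coordinates
overfit_part = X2.T @ alpha  # overfitting component on the remaining coordinates

# Heuristic comparison: plain ridge with lam = trace of the tail covariance.
lam = eigvals[k:].sum()
ridge_plain = np.linalg.solve(X1.T @ X1 + lam * np.eye(k), X1.T @ y)

# Effective rank of the tail: sum of tail eigenvalues over the largest tail eigenvalue.
effective_rank = eigvals[k:].sum() / eigvals[k]

print("interpolation residual:", np.linalg.norm(X @ beta_hat - y))
print("ridge part vs plain ridge:", np.linalg.norm(ridge_part - ridge_plain))
print("effective rank of the tail:", effective_rank)
```

In this sketch the block identity holds for any split index `k`; which `k` is useful, and how well `gram_tail` is approximated by `lam * I`, depends on the spectrum of $\Sigma$ and on the effective rank being large relative to `n`.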