We establish precise structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic functionals of subsample ridge estimators, fitted with different ridge regularization levels $\lambda$ and subsample aspect ratios $\psi$ (the ratio of the feature dimension to the subsample size), are asymptotically equivalent along specific paths in the $(\lambda,\psi)$-plane. Our results require only bounded moment assumptions on the feature and response distributions and allow for arbitrary joint distributions. Furthermore, we provide a data-dependent method to determine the equivalent paths of $(\lambda,\psi)$. An indirect implication of our equivalences is that the prediction risk of optimally tuned ridge regression is monotonic in the data aspect ratio. This resolves a recent open problem raised by Nakkiran et al. for general data distributions under proportional asymptotics, assuming a mild regularity condition that maintains regression hardness through linearized signal-to-noise ratios.
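To make the objects in the abstract concrete, the following is a minimal sketch of a subsample ridge ensemble and of one linear and one quadratic functional of it, evaluated at a few $(\lambda,\psi)$ pairs. It is not the paper's procedure: the function names `ridge_fit` and `ensemble_ridge`, the $1/k$ normalization of the Gram matrix, the synthetic Gaussian data, the direction `a`, and the chosen grid are all assumptions made for illustration. The sketch does not compute the equivalence paths themselves; it only shows how the functionals could be measured at given $(\lambda,\psi)$.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimator (X^T X / k + lam * I)^{-1} X^T y / k, for lam > 0.

    The 1/k normalization is an assumption of this sketch.
    """
    k, p = X.shape
    return np.linalg.solve(X.T @ X / k + lam * np.eye(p), X.T @ y / k)


def ensemble_ridge(X, y, lam, psi, n_members, rng):
    """Average ridge fits over random subsamples of size k = round(p / psi)."""
    n, p = X.shape
    k = int(round(p / psi))  # psi is feature dimension / subsample size
    if not 1 <= k <= n:
        raise ValueError("psi must give a subsample size between 1 and n")
    fits = []
    for _ in range(n_members):
        idx = rng.choice(n, size=k, replace=False)  # subsample without replacement
        fits.append(ridge_fit(X[idx], y[idx], lam))
    return np.mean(fits, axis=0)


# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 400, 100
beta0 = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta0 + rng.normal(size=n)
a = np.ones(p) / np.sqrt(p)  # fixed direction defining a linear functional

# Evaluate both functionals at a few (lambda, psi) pairs (arbitrary grid).
for lam, psi in [(0.1, p / n), (0.1, 0.5), (1.0, p / n), (1.0, 0.5)]:
    beta_hat = ensemble_ridge(X, y, lam, psi, n_members=50, rng=rng)
    linear = a @ beta_hat                        # linear functional a^T beta_hat
    quadratic = np.sum((beta_hat - beta0) ** 2)  # squared estimation error
    print(f"lam={lam:.1f} psi={psi:.2f} linear={linear:.3f} quadratic={quadratic:.3f}")
```

When $\psi$ equals the data aspect ratio $p/n$, each subsample is the full data set, so the ensemble reduces to ordinary ridge regression at the same $\lambda$. Larger $\psi$ means smaller subsamples, which is the direction in which subsampling is traded against explicit regularization.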