We consider linear regression problems with a varying number of random projections, for which we provably exhibit a double descent curve on a fixed prediction problem, using a high-dimensional analysis based on random matrix theory. We first consider the ridge regression estimator and reinterpret earlier results using classical notions from non-parametric statistics, namely degrees of freedom, also known as effective dimensionality. In particular, we show that the random-design performance of ridge regression with a given regularization parameter matches the classical bias and variance expressions from the easier fixed-design analysis, but with a different, larger implicit regularization parameter. We then compute asymptotic equivalents of the generalization performance (in terms of bias and variance) of the minimum-norm least-squares fit with random projections, providing simple expressions for the double descent phenomenon.
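The double descent behavior described above can be illustrated with a small simulation. The following is a minimal sketch and not the analysis of the paper. It assumes isotropic Gaussian inputs, a Gaussian projection matrix, and a fixed noise level. The function name `excess_risk_curve`, its parameters, and the problem sizes are hypothetical choices made for illustration only.

```python
import numpy as np


def excess_risk_curve(n=40, d=200, ms=(10, 20, 40, 80, 160),
                      sigma=0.5, trials=20, seed=0):
    """Average excess risk of min-norm least squares on m random projections.

    Illustrative assumptions: inputs x ~ N(0, I_d), Gaussian projection
    matrix S of shape (d, m), and Gaussian label noise with std `sigma`.
    """
    rng = np.random.default_rng(seed)
    # Fixed prediction problem: one unit-norm target vector shared by all m.
    theta_star = rng.standard_normal(d)
    theta_star /= np.linalg.norm(theta_star)

    risks = {m: 0.0 for m in ms}
    for _ in range(trials):
        X = rng.standard_normal((n, d))  # isotropic design, covariance = I
        y = X @ theta_star + sigma * rng.standard_normal(n)
        for m in ms:
            # Random projection of the d features down (or up) to m features.
            S = rng.standard_normal((d, m)) / np.sqrt(d)
            # Minimum-norm least-squares solution in the projected space.
            eta = np.linalg.pinv(X @ S) @ y
            theta_hat = S @ eta  # implied linear predictor in R^d
            # With identity covariance, the excess risk is the squared
            # distance between the fitted and the true parameter vectors.
            risks[m] += np.sum((theta_hat - theta_star) ** 2) / trials
    return risks


curve = excess_risk_curve()
for m, r in curve.items():
    print(f"m = {m:4d}   excess risk = {r:.3f}")
```

Under these assumptions, the averaged excess risk should rise sharply as the number of projections m approaches the number of observations n, and decrease again for m larger than n. The exact values depend on the seed and on the number of trials.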