Research on minimum $\ell_2$-norm (ridgeless) interpolation least squares estimators has grown substantially in recent years. However, most of these analyses rely on an unrealistic regression error structure: independent and identically distributed errors with zero mean and common variance. In this paper, we study prediction risk as well as estimation risk under more general regression error assumptions, highlighting the benefits of overparameterization in a more realistic setting that allows for clustered or serial dependence. Notably, we establish that the estimation difficulties associated with the variance components of both risks can be summarized through the trace of the variance-covariance matrix of the regression errors. Our findings suggest that the benefits of overparameterization can extend to time series, panel, and grouped data.