This paper studies the generalization performance of iterates obtained by Gradient Descent (GD), Stochastic Gradient Descent (SGD), and their proximal variants in high-dimensional robust regression, where the number of features is comparable to the sample size and the errors may be heavy-tailed. We introduce estimators that precisely track the generalization error of the iterates along the trajectory of the iterative algorithm and are provably consistent under suitable conditions. The results are illustrated through several examples, including Huber regression, pseudo-Huber regression, and their penalized variants with non-smooth regularizers. We provide explicit generalization error estimates for iterates generated by GD and SGD, and by proximal SGD in the presence of a non-smooth regularizer. The proposed risk estimates serve as effective proxies for the actual generalization error, allowing us to determine the optimal stopping iteration that minimizes the generalization error. Extensive simulations confirm the effectiveness of the proposed generalization error estimates.
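To make the setup concrete, the following is a minimal illustrative sketch (not the paper's estimator) of plain GD on the Huber regression objective with synthetic heavy-tailed data, where the number of features is comparable to the sample size. Since the true coefficient vector `beta` is known in this simulation, the generalization error along the trajectory is computed exactly, and the oracle optimal stopping iteration is the one minimizing it; all names, dimensions, and step-size choices here are illustrative assumptions.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    # Derivative of the Huber loss: r where |r| <= delta, delta*sign(r) otherwise
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def gd_huber(X, y, step=0.1, n_iters=200, delta=1.0):
    # Plain gradient descent on (1/n) * sum of Huber losses of the residuals
    n, p = X.shape
    b = np.zeros(p)
    iterates = []
    for _ in range(n_iters):
        r = y - X @ b
        grad = -X.T @ huber_grad(r, delta) / n
        b = b - step * grad
        iterates.append(b.copy())
    return iterates

# Toy high-dimensional data: p comparable to n, heavy-tailed (Student-t) errors
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_t(df=2, size=n)

iterates = gd_huber(X, y)
# Exact generalization error ||b_t - beta||^2 along the trajectory (oracle:
# beta is known only because this is a simulation)
errs = [np.sum((b - beta) ** 2) for b in iterates]
t_star = int(np.argmin(errs))  # oracle optimal stopping iteration
```

In practice `beta` is unknown, which is precisely why the paper's consistent risk estimators are needed: they would replace the oracle `errs` curve above with a data-driven proxy used to pick the stopping iteration.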