This paper studies the generalization performance of iterates obtained by Gradient Descent (GD), Stochastic Gradient Descent (SGD), and their proximal variants in high-dimensional robust regression problems, where the number of features is comparable to the sample size and the errors may be heavy-tailed. We introduce estimators that precisely track the generalization error of the iterates along the trajectory of the iterative algorithm. These estimators are provably consistent under suitable conditions. The results are illustrated through several examples, including Huber regression, pseudo-Huber regression, and their penalized variants with a non-smooth regularizer. We provide explicit generalization error estimates for iterates generated by GD and SGD, and by proximal SGD in the presence of a non-smooth regularizer. The proposed risk estimates serve as effective proxies for the actual generalization error, allowing us to determine the optimal stopping iteration that minimizes the generalization error. Extensive simulations confirm the effectiveness of the proposed generalization error estimates.
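As a minimal illustration of the early-stopping use case described above, the following Python sketch runs GD on the Huber loss in a regime where the dimension p is comparable to the sample size n, with heavy-tailed errors, and records the generalization error of each iterate along the trajectory. The error curve here is computed with oracle knowledge of the true coefficient vector, purely for illustration; the paper's estimators are designed to track this curve from the data alone. The dimensions, step size, and Huber threshold below are arbitrary choices for the demo, not values from the paper.

```python
import numpy as np

# Illustration only (not the paper's estimator): GD on the Huber loss with
# p comparable to n and heavy-tailed noise; track the generalization error
# of each iterate and report the iteration that minimizes it.

rng = np.random.default_rng(0)
n, p, T = 500, 400, 200          # sample size, dimension, GD iterations
lr, delta = 0.2, 1.0             # step size, Huber threshold (demo values)

beta_star = rng.normal(size=p) / np.sqrt(p)   # true coefficients
X = rng.normal(size=(n, p))                   # isotropic Gaussian design
eps = rng.standard_t(df=2, size=n)            # heavy-tailed errors
y = X @ beta_star + eps

def huber_score(r, delta):
    # derivative of the Huber loss: r if |r| <= delta, else delta * sign(r)
    return np.clip(r, -delta, delta)

b = np.zeros(p)
risks = []
for t in range(T):
    r = y - X @ b
    b = b + (lr / n) * (X.T @ huber_score(r, delta))  # gradient descent step
    # generalization error ||b_t - beta_star||^2 under isotropic design;
    # computed here with the oracle beta_star, which the paper's data-driven
    # estimates avoid needing
    risks.append(np.sum((b - beta_star) ** 2))

t_opt = int(np.argmin(risks))
print(f"optimal stopping iteration: {t_opt}, risk: {risks[t_opt]:.4f}")
```

In practice the oracle risk curve is unobservable; the point of the paper's estimates is to supply a consistent, computable proxy for `risks` so that `t_opt` can be selected from the training data.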