High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares-based methods. To improve robustness, we adopt a nonsmooth rank objective based on Wilcoxon scores and incorporate structured group-sparsity regularization, a natural generalization of the lasso, yielding a group-lasso-regularized rank regression method. By extending the tuning-free parameter selection scheme originally developed for the lasso, we introduce a data-driven, simulation-based tuning rule and further establish a finite-sample error bound for the resulting estimator. On the computational side, we develop a proximal augmented Lagrangian method for the associated optimization problem, which eliminates the singularity issues encountered in existing methods and thereby enables efficient semismooth Newton updates for the subproblems. Extensive numerical experiments demonstrate the robustness and effectiveness of the proposed estimator relative to alternatives, and showcase the scalability of the algorithm in both simulated and real-data settings.
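To make the objective concrete, the following is a minimal sketch (not the paper's implementation) of the two ingredients described above: Jaeckel's rank dispersion with Wilcoxon scores, which up to scaling equals the mean pairwise absolute difference of the residuals, plus a group lasso penalty summing Euclidean norms over coefficient groups. The function names and the group encoding are illustrative assumptions.

```python
import numpy as np

def wilcoxon_rank_loss(residuals):
    # Jaeckel's dispersion with Wilcoxon scores phi(u) = sqrt(12) * (u - 1/2):
    # (1/n) * sum_i phi(R(e_i) / (n + 1)) * e_i, computed here by sorting.
    # The scores sum to zero, so the loss is invariant to shifts in e.
    e = np.sort(np.asarray(residuals, dtype=float))
    n = e.size
    scores = np.sqrt(12.0) * (np.arange(1, n + 1) / (n + 1) - 0.5)
    return float(np.mean(scores * e))

def group_lasso_penalty(beta, groups, lam):
    # Sum of Euclidean norms over (illustrative) index groups:
    # lam * sum_g ||beta_g||_2, which induces group-wise sparsity.
    return lam * sum(np.linalg.norm(beta[g]) for g in groups)

def objective(y, X, beta, groups, lam):
    # Group-lasso-regularized rank regression objective (a sketch):
    # rank loss of the residuals plus the group sparsity penalty.
    return wilcoxon_rank_loss(y - X @ beta) + group_lasso_penalty(beta, groups, lam)
```

Because the rank loss depends on residuals only through their pairwise differences, no intercept is penalized or even identified by it, which is one reason rank objectives pair naturally with sparsity penalties on the slope coefficients.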