High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares-based methods. To improve robustness, we adopt a nonsmooth rank objective based on Wilcoxon scores and incorporate structured group-sparsity regularization, a natural generalization of the lasso, yielding a group lasso regularized rank regression method. By extending the tuning-free parameter-selection scheme originally developed for the lasso, we introduce a data-driven, simulation-based tuning rule and further establish a finite-sample error bound for the resulting estimator. On the computational side, we develop a proximal augmented Lagrangian method for the associated optimization problem; it eliminates the singularity issues encountered in existing methods and thereby enables efficient semismooth Newton updates for the subproblems. Extensive numerical experiments demonstrate the robustness and effectiveness of the proposed estimator relative to alternatives, and showcase the scalability of the algorithm on both simulated and real data.
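To make the two statistical ingredients concrete, the sketch below shows a minimal version of the Wilcoxon-score rank objective with a group lasso penalty, together with the simulation-based tuning rule: under the rank objective, the score at the true coefficient vector depends on the data only through the ranks of the errors, which for continuous noise form a uniform random permutation, so the regularization level can be calibrated by simulating that pivotal score. All function names, and the constants `c`, `alpha`, and `n_sim`, are illustrative assumptions, not the paper's calibrated choices.

```python
import numpy as np

def wilcoxon_rank_loss(beta, X, y):
    """Wilcoxon-score rank objective: average of |e_i - e_j| over all
    ordered pairs, where e = y - X @ beta are the residuals."""
    e = y - X @ beta
    n = e.size
    # Pairwise absolute residual differences; O(n^2) memory, fine for a sketch.
    diff = np.abs(e[:, None] - e[None, :])
    return diff.sum() / (n * (n - 1))

def group_lasso_penalty(beta, groups, lam):
    """Group lasso penalty: lam * sum over groups g of ||beta_g||_2,
    where `groups` is a list of index arrays partitioning the coefficients."""
    return lam * sum(np.linalg.norm(beta[g]) for g in groups)

def simulate_lambda(X, groups, n_sim=500, alpha=0.1, c=1.1, seed=None):
    """Simulation-based tuning rule (sketch): simulate the pivotal score
    by drawing random ranks, then take a high quantile of its group-wise
    dual norm, scaled by a constant slightly above one."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    stats = np.empty(n_sim)
    for s in range(n_sim):
        r = rng.permutation(n) + 1  # random ranks 1..n of the errors
        score = X.T @ (2.0 * r - (n + 1)) * 2.0 / (n * (n - 1))
        # Dual norm of the group lasso: max over groups of ||score_g||_2.
        stats[s] = max(np.linalg.norm(score[g]) for g in groups)
    return c * np.quantile(stats, 1.0 - alpha)
```

Because the rule only requires random permutations and the design matrix, it avoids cross-validation over a grid of penalty levels, which is the sense in which the procedure is tuning-free.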
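On the computational side, the workhorse inside an augmented Lagrangian solver of this kind is the proximal mapping of the group lasso norm, which has a closed form (block soft-thresholding) and an explicitly computable generalized Jacobian; the semismooth Newton subproblems solve linear systems assembled from these Jacobian blocks. The snippet below is a sketch of these two standard ingredients, not the paper's full algorithm.

```python
import numpy as np

def prox_group_lasso(v, groups, tau):
    """Proximal mapping of tau * sum_g ||.||_2 (block soft-thresholding):
    each block of v is shrunk toward zero by tau in Euclidean norm and
    zeroed out entirely if ||v_g||_2 <= tau."""
    out = v.copy()
    for g in groups:
        nrm = np.linalg.norm(v[g])
        out[g] = 0.0 if nrm <= tau else (1.0 - tau / nrm) * v[g]
    return out

def prox_jacobian_block(v_g, tau):
    """One generalized-Jacobian block of the prox at v_g: the zero matrix
    if the block is thresholded away, otherwise
    (1 - tau/||v_g||) I + (tau/||v_g||) u u^T with u = v_g / ||v_g||.
    Semismooth Newton steps use these blocks in place of a gradient,
    which is unavailable since the prox is nonsmooth."""
    d = v_g.size
    nrm = np.linalg.norm(v_g)
    if nrm <= tau:
        return np.zeros((d, d))
    u = v_g / nrm
    return (1.0 - tau / nrm) * np.eye(d) + (tau / nrm) * np.outer(u, u)
```

The block structure keeps each Newton system small and sparse, which is what makes the semismooth Newton updates for the subproblems efficient at high dimensions.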