Efficient Group Lasso Regularized Rank Regression with Data-Driven Parameter Determination

High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares-based methods. To improve robustness, we adopt a nonsmooth rank objective based on Wilcoxon scores and combine it with structured group sparsity regularization, a natural generalization of the lasso, yielding a group lasso regularized rank regression method. By extending the tuning-free parameter selection scheme originally developed for the lasso, we introduce a data-driven, simulation-based tuning rule and establish a finite-sample error bound for the resulting estimator. On the computational side, we develop a proximal augmented Lagrangian method for the associated optimization problem. This method eliminates the singularity issues encountered in existing methods and thereby enables efficient semismooth Newton updates for the subproblems. Extensive numerical experiments demonstrate the robustness and effectiveness of the proposed estimator relative to alternatives, and show that the algorithm scales well in both simulated and real-data settings.
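The objective and the simulation-based tuning idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the pairwise-difference form of the Wilcoxon rank loss and a tuning rule that takes an upper quantile of the group-wise dual norm of the loss gradient at the true coefficients, which depends on the noise only through its ranks and can therefore be simulated with random permutations. The function names, the constants `c` and `alpha`, and the choice of group weights are all hypothetical.

```python
import numpy as np

def wilcoxon_loss(beta, X, y):
    """Pairwise-difference rank loss: mean of |r_i - r_j| over ordered pairs i != j."""
    r = y - X @ beta
    n = r.shape[0]
    diffs = np.abs(r[:, None] - r[None, :])  # n x n matrix, zero diagonal
    return diffs.sum() / (n * (n - 1))

def group_lasso_penalty(beta, groups, weights):
    """Weighted sum of Euclidean norms of the coefficient blocks."""
    return sum(w * np.linalg.norm(beta[g]) for g, w in zip(groups, weights))

def simulate_lambda(X, groups, weights, alpha=0.1, c=1.1, n_sim=500, seed=0):
    """Assumed tuning rule: c times the (1 - alpha) quantile of the simulated
    dual norm max_g ||S_g||_2 / w_g, where S is the loss gradient at the truth."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    stats = np.empty(n_sim)
    for b in range(n_sim):
        # With continuous i.i.d. noise, the residual ranks at the true
        # coefficients form a uniformly random permutation of 1..n.
        ranks = rng.permutation(n) + 1
        score = -2.0 / (n * (n - 1)) * (X.T @ (2 * ranks - (n + 1)))
        stats[b] = max(np.linalg.norm(score[g]) / w
                       for g, w in zip(groups, weights))
    return c * np.quantile(stats, 1 - alpha)
```

Under these assumptions, `simulate_lambda` needs only the design matrix and the group structure, not the response or any estimate of the noise scale, which is what makes a rule of this kind data-driven without cross-validation.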

