The parallel alternating direction method of multipliers (ADMM) is well suited to large-scale datasets stored in a distributed manner, which has made it a popular choice for fitting statistical learning models. However, few parallel algorithms have been designed specifically for high-dimensional regression with combined (composite) regularization terms. These terms, such as elastic-net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have gained significant attention in various fields because they incorporate prior information and promote sparsity within specific groups or fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for some of the associated proximal operators. In this paper, we propose a unified constrained optimization formulation, based on the consensus problem, for these convex and nonconvex regression problems, and we derive the corresponding parallel ADMM algorithms. Furthermore, we prove that the proposed algorithm converges globally and attains a linear convergence rate. Extensive simulation experiments, along with a financial example, demonstrate the reliability, stability, and scalability of our algorithm. An R package implementing the proposed algorithms is available at https://github.com/xfwu1016/CPADMM.
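To make the consensus idea concrete, the following is a minimal sketch of the textbook global-consensus ADMM applied to the convex elastic-net case. It is not the algorithm proposed in this paper or the interface of the CPADMM package. It assumes the data are split by rows into M blocks (X_i, y_i), that each node keeps a local copy b_i constrained to equal a shared variable z, and that the loss is (1/2)||X_i b_i - y_i||^2. The function names `soft_threshold` and `consensus_admm_enet`, and the parameters `lam1`, `lam2`, `rho`, and `n_iter`, are hypothetical names chosen for illustration.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def consensus_admm_enet(X_blocks, y_blocks, lam1, lam2, rho=1.0, n_iter=500):
    """Illustrative consensus ADMM (scaled form) for the elastic net.

    Minimizes sum_i 0.5 * ||X_i b - y_i||^2 + lam1 * ||b||_1 + 0.5 * lam2 * ||b||^2
    with one local copy of b per data block. All names here are assumptions.
    """
    M = len(X_blocks)
    p = X_blocks[0].shape[1]
    z = np.zeros(p)          # shared (consensus) coefficients
    B = np.zeros((M, p))     # local coefficient copies, one row per node
    U = np.zeros((M, p))     # scaled dual variables, one row per node

    # Quantities that each node could precompute once from its own data.
    A = [X.T @ X + rho * np.eye(p) for X in X_blocks]
    Xty = [X.T @ y for X, y in zip(X_blocks, y_blocks)]

    for _ in range(n_iter):
        # Local ridge-type updates; independent across nodes, hence parallelizable.
        for i in range(M):
            B[i] = np.linalg.solve(A[i], Xty[i] + rho * (z - U[i]))

        # Central update: closed-form elastic-net proximal step on the average.
        v = (B + U).mean(axis=0)
        z = soft_threshold(M * rho * v, lam1) / (M * rho + lam2)

        # Dual update on each node.
        U += B - z

    return z
```

In this sketch only the z-update needs communication, since it averages the local quantities b_i + u_i. The elastic-net penalty is used because its proximal step has a closed form; penalties without one, which the abstract identifies as a source of difficulty, would need a different treatment of this step.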