We consider primal-dual algorithms for general empirical risk minimization problems in distributed settings, focusing on two prominent classes of algorithms. The first is communication-efficient distributed dual coordinate ascent (CoCoA), derived from the coordinate ascent method for solving the dual problem. The second is the alternating direction method of multipliers (ADMM), including consensus ADMM, proximal ADMM, and linearized ADMM. We show that both classes of algorithms can be transformed into a unified update form that involves only primal and dual variables. This unification reveals key connections between the two classes: CoCoA can be interpreted as a special case of proximal ADMM for solving the dual problem, and consensus ADMM is equivalent to a proximal ADMM algorithm. It also shows how the ADMM variants can be made to outperform the CoCoA variants simply by adjusting the augmented Lagrangian parameter. We further explore linearized versions of ADMM and analyze the effects of tuning parameters on these ADMM variants in the distributed setting. Extensive simulation studies and real-world data analyses support our theoretical findings.