Distributed empirical risk minimization (ERM) is often studied through two influential yet seemingly separate families of methods: CoCoA-type algorithms, derived from distributed dual coordinate ascent, and ADMM-type algorithms, derived from consensus and proximal splitting. In this paper, we investigate the connection between these two families from a unified primal-dual perspective. We show that consensus ADMM, linearized consensus ADMM, two distributed proximal ADMM variants, and ridge-regularized CoCoA can all be written in a common update form involving a global primal variable and block dual variables. This reformulation makes several previously hidden connections explicit. For ridge-regularized ERM, CoCoA coincides with a particular proximal ADMM scheme at the level of the dual update. Moreover, consensus ADMM on the primal problem is equivalent to proximal ADMM on the dual problem under an explicit parameter mapping together with a sign reversal of the saddle objective; similar correspondences hold for the linearized variants. These results indicate that, on ridge-regularized ERM problems, well-tuned ADMM-type algorithms perform at least as well as CoCoA. The unified view also yields a natural primal-dual gap stopping criterion for consensus ADMM and a unified $O(1/T)$ ergodic convergence analysis for the ADMM-type methods. Experiments on synthetic regression problems and real SVM datasets support the predicted relationships, clarify the role of the tuning parameters, and show that suitably tuned ADMM variants can outperform CoCoA in the ridge-regularized setting.
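To make the "global primal variable and block dual variables" structure concrete, the following is a minimal sketch of textbook scaled-form consensus ADMM applied to ridge regression. It is not the paper's algorithm or parametrization: the objective scaling $\frac{1}{2n}\|Xw-y\|^2 + \frac{\lambda}{2}\|w\|^2$, the penalty parameter `rho`, the fixed iteration count, and the function names `consensus_admm_ridge` and `ridge_closed_form` are all assumptions made for illustration.

```python
import numpy as np


def consensus_admm_ridge(X_blocks, y_blocks, lam, rho=0.5, iters=300):
    """Scaled-form consensus ADMM for ridge regression (illustrative sketch).

    Assumed objective: (1/(2n)) * ||X w - y||^2 + (lam/2) * ||w||^2,
    with the rows of X split across K workers.
    """
    n = sum(len(yk) for yk in y_blocks)   # total number of samples
    K = len(X_blocks)                     # number of workers
    d = X_blocks[0].shape[1]              # feature dimension

    z = np.zeros(d)        # global primal (consensus) variable
    W = np.zeros((K, d))   # local primal copies, one row per worker
    U = np.zeros((K, d))   # scaled block dual variables, one row per worker

    # Each local subproblem is a linear system; precompute its pieces.
    lhs = [Xk.T @ Xk / n + rho * np.eye(d) for Xk in X_blocks]
    rhs = [Xk.T @ yk / n for Xk, yk in zip(X_blocks, y_blocks)]

    for _ in range(iters):
        # Local step: each worker minimizes its loss plus a proximal term.
        for k in range(K):
            W[k] = np.linalg.solve(lhs[k], rhs[k] + rho * (z - U[k]))
        # Global step: closed-form minimizer for the ridge regularizer.
        z = K * rho * (W.mean(axis=0) + U.mean(axis=0)) / (lam + K * rho)
        # Dual step: accumulate the consensus residual on every block.
        U += W - z
    return z


def ridge_closed_form(X, y, lam):
    """Reference solution of the same (assumed) ridge objective."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
```

On a small synthetic problem, the consensus iterate `z` can be compared against `ridge_closed_form` to check that the distributed updates recover the centralized solution. A fixed iteration count is used here only for simplicity; the primal-dual gap criterion mentioned in the abstract is not reproduced in this sketch.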