Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems

We study the decentralized stochastic nonconvex strongly-concave (NCSC) minimax problem with nonsmooth regularization terms on both the primal and dual variables, in which a network of $m$ computing agents collaborates via peer-to-peer communication. We consider coupling functions in either expectation or finite-sum form, with convex regularizers applied separately to the primal and dual variables. Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable. Combining this with variance-reduction (VR) techniques, our proposed method, named VRLM, requires only a single neighbor communication per iteration and achieves an $\mathcal{O}(\kappa^3\varepsilon^{-3})$ sample complexity in the general stochastic setting with either a big-batch or a small-batch VR option, where $\kappa$ is the condition number of the problem and $\varepsilon$ is the desired solution accuracy. With big-batch VR, it additionally achieves an $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity. In the special finite-sum setting, our method with big-batch VR achieves an $\mathcal{O}(n + \sqrt{n} \kappa^2\varepsilon^{-2})$ sample complexity and an $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity, where $n$ is the number of components in the finite sum. All of these complexity results match the best-known results achieved by a few existing methods for special cases of the problem we consider. To the best of our knowledge, this is the first work that provides convergence guarantees for NCSC minimax problems with general convex nonsmooth regularizers on both the primal and dual variables in the decentralized stochastic setting. We report numerical experiments on two machine learning problems. Our code is available at https://github.com/RPI-OPT/VRLM.
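
For orientation, the problem class described above can plausibly be written as follows. This is a sketch in notation assumed here, not taken from the text: $f_i$ denotes the local coupling function of agent $i$, $g$ and $h$ denote the convex, possibly nonsmooth regularizers on the primal variable $x$ and the dual variable $y$, and $d_1$, $d_2$ are their dimensions.

$$
\min_{x \in \mathbb{R}^{d_1}} \; \max_{y \in \mathbb{R}^{d_2}} \; \frac{1}{m} \sum_{i=1}^{m} f_i(x, y) + g(x) - h(y),
$$

where each $f_i$ is nonconvex in $x$ and strongly concave in $y$, and takes one of two forms (the sample variable $\xi_i$, the components $F_{i}$ and $f_{i,j}$, and the reading of $n$ as a per-agent count are likewise assumptions):

$$
f_i(x, y) = \mathbb{E}_{\xi_i}\big[ F_i(x, y; \xi_i) \big] \quad \text{(expectation form)}, \qquad f_i(x, y) = \frac{1}{n} \sum_{j=1}^{n} f_{i,j}(x, y) \quad \text{(finite-sum form)}.
$$

In a decentralized reformulation of this kind, each agent would typically hold local copies $(x_i, y_i)$ subject to consensus constraints $x_1 = \dots = x_m$ and $y_1 = \dots = y_m$; the Lagrangian multiplier mentioned above is introduced to handle the constraint on the $y_i$.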
