In this paper, we consider solving the distributed optimization problem over a multi-agent network under a communication-restricted setting. We study a compressed decentralized stochastic gradient method, termed ``compressed exact diffusion with adaptive stepsizes (CEDAS)'', and show that it asymptotically achieves a convergence rate comparable to that of centralized stochastic gradient descent (SGD) for both smooth strongly convex and smooth nonconvex objective functions under unbiased compression operators. In particular, to our knowledge, CEDAS enjoys the shortest transient time to date (with respect to the graph specifics) for achieving the convergence rate of centralized SGD, which behaves as $\mathcal{O}(nC^3/(1-\lambda_2)^{2})$ for smooth strongly convex objective functions and $\mathcal{O}(n^3C^6/(1-\lambda_2)^4)$ for smooth nonconvex objective functions, where $(1-\lambda_2)$ denotes the spectral gap of the mixing matrix and $C>0$ is the compression-related parameter. Notably, CEDAS exhibits the shortest transient times when $C < \mathcal{O}(1/(1-\lambda_2)^2)$, a condition that commonly holds in practice. Numerical experiments further demonstrate the effectiveness of the proposed algorithm.