In this paper, we consider solving the distributed optimization problem over a multi-agent network in the communication-restricted setting. We study a compressed decentralized stochastic gradient method, termed ``compressed exact diffusion with adaptive stepsizes (CEDAS)'', and show that, under unbiased compression operators, the method asymptotically achieves a convergence rate comparable to that of centralized stochastic gradient descent (SGD) for both smooth strongly convex and smooth nonconvex objective functions. In particular, to our knowledge, CEDAS achieves the shortest transient time reported so far (with respect to the graph specifics) for reaching the convergence rate of centralized SGD: the transient time behaves as $\mathcal{O}(nC^3/(1-\lambda_2)^{2})$ for smooth strongly convex objective functions and as $\mathcal{O}(n^3C^6/(1-\lambda_2)^4)$ for smooth nonconvex objective functions, where $(1-\lambda_2)$ denotes the spectral gap of the mixing matrix and $C>0$ is the compression-related parameter. Numerical experiments further demonstrate the effectiveness of the proposed algorithm.
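As a rough illustration of how the two transient-time expressions scale with the network, the following minimal Python sketch builds a mixing matrix, computes its spectral gap $(1-\lambda_2)$, and evaluates $nC^3/(1-\lambda_2)^2$ and $n^3C^6/(1-\lambda_2)^4$. The ring topology, the uniform $1/3$ weights, the value of $C$, and all function names are assumptions made for illustration and are not taken from the paper. The sketch also assumes that $\lambda_2$ is the second largest eigenvalue of a symmetric mixing matrix. The constants hidden by $\mathcal{O}(\cdot)$ are ignored, so only the relative growth of the printed values is meaningful.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n >= 3 agents.

    Assumption for illustration: each agent averages itself and its two
    ring neighbors with equal weight 1/3.
    """
    if n < 3:
        raise ValueError("a ring needs at least 3 agents")
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W


def spectral_gap(W):
    """Return 1 - lambda_2, with lambda_2 the second largest eigenvalue of W."""
    eigvals = np.linalg.eigvalsh(W)  # ascending order for symmetric W
    return 1.0 - eigvals[-2]


def transient_scaling(n, C, gap):
    """Evaluate the two expressions inside the O(.) bounds (constants dropped)."""
    strongly_convex = n * C**3 / gap**2
    nonconvex = n**3 * C**6 / gap**4
    return strongly_convex, nonconvex


if __name__ == "__main__":
    C = 2.0  # hypothetical compression-related parameter
    for n in (8, 16, 32):
        gap = spectral_gap(ring_mixing_matrix(n))
        sc, nc = transient_scaling(n, C, gap)
        print(f"n={n:3d}  gap={gap:.4f}  strongly convex ~ {sc:.3e}  nonconvex ~ {nc:.3e}")
```

On a ring the spectral gap shrinks as the number of agents grows, so both expressions increase quickly with $n$. This is why the dependence on $(1-\lambda_2)$ matters when comparing transient times across methods.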