Communication efficiency has garnered significant attention as it is considered the main bottleneck for large-scale decentralized Machine Learning applications in distributed and federated settings. In this regime, clients are restricted to transmitting small amounts of quantized information to their neighbors over a communication graph. Numerous endeavors have been made to address this challenging problem by developing algorithms with compressed communication for decentralized non-convex optimization problems. Despite these efforts, existing results suffer from limitations such as a lack of scalability with the number of clients, the need for large batches, or reliance on a bounded-gradient assumption. In this paper, we introduce MoTEF, a novel approach that integrates communication compression with Momentum Tracking and Error Feedback. Our analysis demonstrates that MoTEF achieves most of the desired algorithmic properties and significantly outperforms existing methods under arbitrary data heterogeneity. We provide numerical experiments to validate our theoretical findings and confirm the practical superiority of MoTEF.
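To make the three building blocks named above concrete, the following is a minimal, illustrative sketch (not the paper's exact update rules) of a decentralized step that combines a compression operator (here a top-k sparsifier, an assumed choice), error feedback on the transmitted messages, and a momentum-tracking estimate of the global gradient. The quadratic objectives, ring topology, mixing weights, and step sizes are all assumptions made for the example.

```python
# Illustrative sketch only: compressed gossip with error feedback and
# momentum tracking on heterogeneous quadratics. Not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 4, 10, 3                 # clients, dimension, coordinates kept by top-k
gamma, eta, lam = 0.05, 0.5, 0.1   # local step size, consensus step, momentum factor

# Ring topology with simple symmetric mixing weights (assumed for illustration).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

# Heterogeneous local objectives f_i(x) = 0.5 * ||x - b_i||^2.
B = rng.normal(size=(n, d))
def grad(i, x):
    return x - B[i]

def top_k(v, k):
    """Keep the k largest-magnitude coordinates; zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

x = np.zeros((n, d))                               # local models
v = np.array([grad(i, x[i]) for i in range(n)])    # local momentum buffers
g = v.copy()                                       # momentum-tracking gradient estimates
e_x = np.zeros((n, d))                             # error-feedback memory (model messages)
e_g = np.zeros((n, d))                             # error-feedback memory (tracker messages)
hx, hg = x.copy(), g.copy()                        # states reconstructed by neighbors

for t in range(200):
    # Local momentum update, then a tracking-style correction of the gradient estimate.
    v_new = (1 - lam) * v + lam * np.array([grad(i, x[i]) for i in range(n)])
    g = g + v_new - v
    v = v_new

    # Compress the state change plus the accumulated error (error feedback),
    # and update the error memories with what the compressor discarded.
    qx = np.array([top_k(x[i] - hx[i] + e_x[i], k) for i in range(n)])
    qg = np.array([top_k(g[i] - hg[i] + e_g[i], k) for i in range(n)])
    e_x = e_x + x - hx - qx
    e_g = e_g + g - hg - qg
    hx, hg = hx + qx, hg + qg      # neighbors rebuild states from compressed messages

    # Gossip on the reconstructed states, then take a local descent step.
    x = x + eta * (W @ hx - hx) - gamma * g
    g = g + eta * (W @ hg - hg)

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(x.mean(axis=0) - B.mean(axis=0)))
```

The sketch only demonstrates how the three mechanisms interact in a single loop; rates, parameter choices, and the precise coupling of the momentum and tracking variables are the subject of the paper's analysis.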