Communication compression is an essential strategy for alleviating communication overhead in large-scale distributed stochastic optimization: it reduces the volume of information exchanged between computing nodes. Although numerous compression algorithms with convergence guarantees have been proposed, the optimal performance limit under communication compression remains unclear. In this paper, we investigate the performance limit of distributed stochastic optimization algorithms that employ communication compression. We focus on the two main types of compressors, unbiased and contractive, and characterize the best possible convergence rates achievable with them. We establish lower bounds on the convergence rates of distributed stochastic optimization in six settings, obtained by combining strongly convex, generally convex, or non-convex objective functions with unbiased or contractive compressors. To bridge the gap between these lower bounds and the rates of existing algorithms, we propose NEOLITHIC, a nearly optimal compression algorithm that attains the established lower bounds up to logarithmic factors under mild conditions. Extensive experimental results support our theoretical findings. This work provides insight into the theoretical limitations of existing compressors and motivates further research into fundamentally new compressor properties.
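To make the two compressor classes concrete, the following is a minimal sketch, not taken from the paper, using the definitions commonly found in the compression literature (assumed here): an unbiased compressor C satisfies E[C(x)] = x with E||C(x) - x||^2 <= omega * ||x||^2, while a contractive compressor satisfies E||C(x) - x||^2 <= (1 - delta) * ||x||^2. Random-k sparsification with rescaling is a standard example of the first class and top-k sparsification of the second. The function names `rand_k`, `top_k`, `sq_norm`, `sq_dist`, and `empirical_mean` are illustrative choices, and nothing below reflects how NEOLITHIC itself is implemented.

```python
import random


def rand_k(x, k, rng):
    """Random-k sparsifier (unbiased): keep k uniformly chosen coordinates
    and rescale them by d/k so that the expectation equals x."""
    d = len(x)
    keep = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in keep else 0.0 for i, v in enumerate(x)]


def top_k(x, k):
    """Top-k sparsifier (contractive): keep the k largest-magnitude
    coordinates unchanged and zero out the rest. Biased in general."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def sq_norm(x):
    """Squared Euclidean norm."""
    return sum(v * v for v in x)


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def empirical_mean(x, k, trials, seed=0):
    """Monte Carlo estimate of E[rand_k(x)], used to check unbiasedness."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(trials):
        c = rand_k(x, k, rng)
        total = [t + v for t, v in zip(total, c)]
    return [t / trials for t in total]
```

For a vector of dimension d, top-k satisfies the contractive inequality with delta = k/d, and random-k satisfies the unbiased variance bound with omega = d/k - 1; both facts can be checked numerically with the helpers above.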