In distributed optimization, a large number of machines alternate between local computations and communication with a coordinating server. Communication, which can be slow and costly, is the main bottleneck in this setting. To reduce this burden and thereby accelerate distributed gradient descent, two strategies are popular: 1) communicate less frequently, that is, perform several iterations of local computations between communication rounds; and 2) communicate compressed information instead of full-dimensional vectors. We propose CompressedScaffnew, the first algorithm for distributed optimization that jointly harnesses these two strategies and converges linearly to an exact solution in the strongly convex setting, with a doubly accelerated rate: it benefits from the two acceleration mechanisms provided by local training and compression, namely an improved dependence on the condition number of the functions and on the dimension of the model, respectively.
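To make the two strategies concrete, the following is a minimal sketch in Python that combines them naively on a toy problem. It is not CompressedScaffnew: it omits whatever correction mechanism the algorithm uses to reach an exact solution, and no convergence guarantee is claimed for it. The function names (`local_gd`, `rand_k`, `run_round`), the quadratic losses, and the choice of a rand-k sparsifier as the compressor are all assumptions made for illustration only.

```python
import random


def local_gd(x, grad, step, num_steps):
    """Strategy 1: run several gradient steps locally, with no communication."""
    x = list(x)
    for _ in range(num_steps):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def rand_k(v, k, rng):
    """Strategy 2: send only k random coordinates, scaled by d/k.

    The scaling makes the compressed vector an unbiased estimate of v.
    """
    d = len(v)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * vi if i in kept else 0.0 for i, vi in enumerate(v)]


def make_quadratic_grad(center):
    """Gradient of the toy strongly convex loss f(x) = 0.5 * ||x - center||^2."""
    return lambda x: [xi - ci for xi, ci in zip(x, center)]


def run_round(x_server, centers, step, num_local_steps, k, rng):
    """One communication round: local training, then a compressed uplink.

    Each hypothetical client holds one quadratic loss, given by its center.
    """
    d = len(x_server)
    avg_update = [0.0] * d
    for c in centers:
        x_local = local_gd(x_server, make_quadratic_grad(c), step, num_local_steps)
        delta = [xl - xs for xl, xs in zip(x_local, x_server)]
        compressed = rand_k(delta, k, rng)  # only k of d coordinates are sent
        avg_update = [a + ci / len(centers) for a, ci in zip(avg_update, compressed)]
    return [xs + a for xs, a in zip(x_server, avg_update)]
```

In this sketch, each round costs a client `k` transmitted coordinates instead of `d`, and `num_local_steps` gradient steps are taken per round instead of one. Setting `k` equal to the dimension disables compression, in which case the server simply averages the local models.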