Distributed learning is an effective approach to accelerating model training with multiple workers. However, communicating gradients is costly, which introduces substantial communication delays between the workers and the parameter server. SignSGD with majority voting (signSGD-MV) is a simple yet effective optimizer that reduces communication costs through one-bit quantization, but its convergence rate degrades considerably as the number of adversarial workers grows. In this paper, we show that the convergence rate remains invariant as the number of adversarial workers increases, provided that the adversarial workers are outnumbered by the benign workers. The key idea behind this counter-intuitive result is our novel signSGD with federated defense (signSGD-FD). Unlike traditional approaches, signSGD-FD exploits the gradient information sent by adversarial workers by assigning it proper weights, which are obtained through gradient sign decoding. Experimental results demonstrate that signSGD-FD achieves superior convergence rates over traditional algorithms in various adversarial attack scenarios.
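To make the contrast between the two aggregation rules concrete, the following is a minimal sketch, not the paper's implementation. It assumes each worker sends a vector of +1/-1 gradient signs, and that the adversarial workers simply flip every sign. The function names `majority_vote` and `weighted_vote`, and the hand-picked `weights`, are hypothetical; in signSGD-FD the weights come from gradient sign decoding, which this sketch does not implement.

```python
import numpy as np

def majority_vote(signs):
    """Coordinate-wise majority over a (num_workers, dim) array of +1/-1 signs."""
    return np.sign(signs.sum(axis=0))

def weighted_vote(signs, weights):
    """Weighted vote; a negative weight inverts that worker's signs before voting."""
    return np.sign(weights @ signs)

# Toy setup (assumed): 3 benign workers report the true sign,
# 2 adversarial workers report the flipped sign.
true_sign = np.array([1, -1, 1, 1])
benign = np.tile(true_sign, (3, 1))
adversarial = np.tile(-true_sign, (2, 1))
signs = np.vstack([benign, adversarial])

# Plain majority voting: adversarial votes cancel benign ones.
mv = majority_vote(signs)
margin_mv = np.abs(signs.sum(axis=0))

# Weighted voting with illustrative weights (assumed, not decoded):
# negative weights turn the sign-flipping workers into useful votes.
weights = np.array([1.0, 1.0, 1.0, -1.0, -1.0])
wv = weighted_vote(signs, weights)
margin_wv = np.abs(weights @ signs)

print("majority vote:", mv, "margin:", margin_mv)
print("weighted vote:", wv, "margin:", margin_wv)
```

In this toy case both rules recover the true sign, but the weighted vote does so with a larger margin because the adversarial reports contribute information instead of cancelling benign votes. This is only an illustration of why weighting can help, under the stated assumptions.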