$z$-SignFedAvg: A Unified Stochastic Sign-based Compression for Federated Learning

Federated Learning (FL) is a promising privacy-preserving distributed learning paradigm, but it suffers from high communication cost when training large-scale machine learning models. Sign-based methods, such as SignSGD \cite{bernstein2018signsgd}, have been proposed as a biased gradient compression technique for reducing this cost. However, sign-based algorithms can diverge under heterogeneous data, which has motivated more advanced techniques, such as error feedback and stochastic sign-based compression, to fix this issue. Nevertheless, these methods still suffer from slower convergence rates, and none of them allows multiple local SGD updates as FedAvg \cite{mcmahan2017communication} does. In this paper, we propose a novel noisy perturbation scheme with a general symmetric noise distribution for sign-based compression, which not only allows one to flexibly control the tradeoff between gradient bias and convergence performance, but also provides a unified view of existing stochastic sign-based methods. More importantly, the unified noisy perturbation scheme enables the development of the first sign-based FedAvg algorithm ($z$-SignFedAvg) to accelerate convergence. Theoretically, we show that $z$-SignFedAvg achieves a faster convergence rate than existing sign-based methods and, under uniformly distributed noise, attains the same convergence rate as its uncompressed counterpart. Extensive experiments demonstrate that $z$-SignFedAvg achieves competitive empirical performance on real datasets and outperforms existing schemes.
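To make the perturb-then-sign idea concrete, the following is a minimal sketch and not the paper's algorithm: each coordinate of a gradient is perturbed with symmetric noise and only its sign is kept. The function name `noisy_sign_compress`, the scale parameter `sigma`, and the toy gradient are hypothetical, introduced only for illustration. The sketch relies on a standard fact: if $u$ is uniform on $[-1, 1]$ and $|g| \le \sigma$, then $\mathbb{E}[\mathrm{sign}(g + \sigma u)] = g / \sigma$, so the rescaled signs are an unbiased estimate of $g$. With Gaussian noise the expectation is a smooth but nonlinear function of $g / \sigma$, so some bias remains.

```python
import numpy as np


def noisy_sign_compress(grad, sigma, rng, noise="uniform"):
    """Add symmetric noise to each coordinate, then keep only the sign (+1 or -1).

    Hypothetical helper for illustration; `sigma` scales the noise.
    """
    if noise == "uniform":
        xi = rng.uniform(-1.0, 1.0, size=grad.shape)   # symmetric, bounded
    elif noise == "gaussian":
        xi = rng.standard_normal(size=grad.shape)      # symmetric, unbounded
    else:
        raise ValueError(f"unknown noise type: {noise}")
    # Map to exactly {-1, +1} so each coordinate costs one bit.
    return np.where(grad + sigma * xi >= 0, 1.0, -1.0)


rng = np.random.default_rng(0)
grad = np.array([0.5, -0.2, 0.0, 0.8])   # toy gradient (assumed values)
sigma = 1.0                              # assumed to satisfy sigma >= max|grad|

# Compress many independent copies of the same gradient and average the signs.
n_samples = 200_000
batch = np.broadcast_to(grad, (n_samples, grad.size))
signs = noisy_sign_compress(batch, sigma, rng, noise="uniform")
estimate = sigma * signs.mean(axis=0)    # approximates grad under uniform noise

# Deterministic sign, for contrast: all magnitude information is lost.
plain_sign = np.sign(grad)
```

In this toy setting, `estimate` approaches `grad` as the number of averaged samples grows, whereas `plain_sign` cannot distinguish 0.5 from 0.8. A larger `sigma` keeps the uniform-noise estimate unbiased for larger gradients but increases its variance, which is one way to read the bias and convergence tradeoff mentioned in the abstract.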
