Reducing communication overhead in federated learning (FL) is challenging but crucial for large-scale, distributed, privacy-preserving machine learning. Existing methods such as sparsification can substantially lower the communication overhead, but they also greatly compromise the convergence rate. In this paper, we propose a novel method, named single-step synthetic features compressor (3SFC), that achieves communication-efficient FL by directly constructing a tiny synthetic dataset based on raw gradients. As a result, 3SFC can achieve an extremely low compression rate when the constructed dataset contains only one data sample. Moreover, 3SFC's compressing phase uses a similarity-based objective function so that it can be optimized with just one step, thereby considerably improving its performance and robustness. In addition, to minimize the compression error, error feedback (EF) is incorporated into 3SFC. Experiments on multiple datasets and models suggest that 3SFC achieves significantly better convergence rates than competing methods at lower compression rates (as low as 0.02%). Furthermore, ablation studies and visualizations show that 3SFC carries more information than competing methods in every communication round, further validating its effectiveness.
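To make the compression idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy multi-output linear model with squared loss, a finite-difference gradient in place of automatic differentiation, a least-squares scaling coefficient sent alongside the synthetic sample, and random vectors standing in for client gradients. All function names (`model_grad`, `compress`, `decompress`) and hyperparameters (`lr`, the dimensions) are hypothetical. A client fits one synthetic sample with a single ascent step on cosine similarity, transmits only that sample plus the scale, and carries the residual into the next round as error feedback.

```python
import numpy as np

def model_grad(W, x, y):
    """Flattened gradient of 0.5 * ||W @ x - y||^2 with respect to W."""
    return np.outer(W @ x - y, x).ravel()

def cosine(a, b, eps=1e-12):
    """Cosine similarity between two flat vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def numeric_grad(f, z, h=1e-5):
    """Central finite-difference gradient of a scalar function f at z."""
    g = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = h
        g[i] = (f(z + step) - f(z - step)) / (2 * h)
    return g

def compress(W, target, x0, y0, lr=0.5):
    """Take ONE ascent step on cosine similarity, then fit a scalar scale."""
    d = x0.size
    z0 = np.concatenate([x0, y0])
    sim = lambda z: cosine(model_grad(W, z[:d], z[d:]), target)
    z = z0 + lr * numeric_grad(sim, z0)           # the single optimization step
    g_syn = model_grad(W, z[:d], z[d:])
    # Assumed extra: least-squares scale so that scale * g_syn best matches target.
    scale = float(g_syn @ target / (g_syn @ g_syn + 1e-12))
    return z[:d], z[d:], scale

def decompress(W, x, y, scale):
    """Server side: rebuild an approximate gradient from the synthetic sample."""
    return scale * model_grad(W, x, y)

rng = np.random.default_rng(0)
k, d = 10, 20                                      # outputs, inputs (hypothetical sizes)
W = rng.normal(size=(k, d))
error = np.zeros(k * d)                            # error-feedback memory

for _ in range(3):                                 # a few communication rounds
    raw_grad = rng.normal(size=k * d)              # stand-in for a real client gradient
    target = raw_grad + error                      # add back what was lost earlier
    x, y, scale = compress(W, target, rng.normal(size=d), rng.normal(size=k))
    approx = decompress(W, x, y, scale)
    error = target - approx                        # residual carried to the next round

payload_size = x.size + y.size + 1                 # floats actually transmitted
print(payload_size, "floats sent instead of", raw_grad.size)
```

In this toy setting the payload is the input dimension plus the output dimension plus one scalar, while the gradient has their product as its size; the gap grows with model size, which is what makes a single synthetic sample attractive as a compressed message. A one-sample gradient of this linear model is rank one, so the approximation here is coarse, and the error-feedback residual is what keeps the discarded information from being lost outright.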