Reducing communication overhead in federated learning (FL) is challenging but crucial for large-scale distributed privacy-preserving machine learning. Existing methods such as sparsification can substantially lower the communication overhead, but they also greatly compromise the convergence rate. In this paper, we propose a novel method, named single-step synthetic features compressor (3SFC), to achieve communication-efficient FL by directly constructing a tiny synthetic dataset based on raw gradients. As a result, 3SFC can achieve an extremely low compression rate when the constructed dataset contains only one data sample. Moreover, 3SFC's compressing phase utilizes a similarity-based objective function so that it can be optimized with just one step, thereby considerably improving its performance and robustness. In addition, to minimize the compression error, error feedback (EF) is also incorporated into 3SFC. Experiments on multiple datasets and models suggest that 3SFC achieves significantly better convergence rates than competing methods at lower compression rates (as low as 0.02%). Furthermore, ablation studies and visualizations show that 3SFC can carry more information than competing methods in every communication round, further validating its effectiveness.
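To make the idea concrete, the following is a minimal sketch of one client round: a single synthetic sample is adjusted with one optimization step so that its gradient aligns with the raw gradient, and the leftover error is kept for the next round (error feedback). It is not the paper's implementation. The tiny linear model, the squared loss, the cosine-similarity objective, the finite-difference gradient, the least-squares scale coefficient, and all function names (`flat_grad`, `sim_loss`, `compress`, `decompress`, `client_round`) are assumptions chosen for illustration.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def flat_grad(W, x, y):
    """Flattened gradient of 0.5 * ||W x - y||^2 with respect to W."""
    residual = [dot(row, x) - yj for row, yj in zip(W, y)]
    return [rj * xi for rj in residual for xi in x]


def sim_loss(W, x, y, target):
    """1 - cosine similarity between the synthetic and target gradients."""
    g = flat_grad(W, x, y)
    denom = math.sqrt(dot(g, g) * dot(target, target))
    return 1.0 - dot(g, target) / denom if denom > 0 else 1.0


def compress(W, target, x, y, lr=0.5, eps=1e-6):
    """Take ONE finite-difference gradient step on (x, y), then fit a scale."""
    base = sim_loss(W, x, y, target)
    params = list(x) + list(y)
    step = []
    for i in range(len(params)):
        bumped = list(params)
        bumped[i] += eps
        loss = sim_loss(W, bumped[:len(x)], bumped[len(x):], target)
        step.append((loss - base) / eps)
    params = [p - lr * s for p, s in zip(params, step)]
    x_new, y_new = params[:len(x)], params[len(x):]
    # Least-squares scale so that scale * synthetic gradient best fits target.
    g = flat_grad(W, x_new, y_new)
    gg = dot(g, g)
    scale = dot(target, g) / gg if gg > 0 else 0.0
    return x_new, y_new, scale


def decompress(W, x, y, scale):
    """Server side: rebuild an approximate gradient from the synthetic sample."""
    return [scale * gi for gi in flat_grad(W, x, y)]


def client_round(W, raw_grad, error, x0, y0):
    """Compress (raw gradient + carried error) and return the new error."""
    target = [g + e for g, e in zip(raw_grad, error)]
    x, y, scale = compress(W, target, x0, y0)
    approx = decompress(W, x, y, scale)
    new_error = [t - a for t, a in zip(target, approx)]
    return (x, y, scale), new_error


random.seed(0)
k, d = 4, 5  # hypothetical model: 4 outputs, 5 inputs, 20 parameters
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]
raw_grad = [random.gauss(0, 1) for _ in range(k * d)]
error = [0.0] * (k * d)
x0 = [random.gauss(0, 1) for _ in range(d)]
y0 = [random.gauss(0, 1) for _ in range(k)]

payload, error = client_round(W, raw_grad, error, x0, y0)
payload_size = len(payload[0]) + len(payload[1]) + 1  # x, y, and the scale
print("floats sent:", payload_size, "of", k * d)
```

In this toy setting the client transmits `d + k + 1` values instead of `k * d`, so the saving grows with model size relative to input size. The residual stored in `error` is added to the next round's raw gradient, which is the role error feedback plays in the description above.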