The exponential growth in model sizes has significantly increased the communication burden in Federated Learning (FL). Existing methods alleviate this burden by transmitting compressed gradients, but they often suffer from high compression errors, which slow the model's convergence. To achieve high compression effectiveness and low compression errors simultaneously, we study the gradient compression problem from a novel perspective. Specifically, we propose a systematic algorithm termed Extended Single-Step Synthetic Features Compressing (E-3SFC), which consists of three sub-components: the Single-Step Synthetic Features Compressor (3SFC), a double-way compression algorithm, and a communication budget scheduler. First, we regard a model's gradient computation as decompressing gradients from the corresponding inputs, and the inverse process as compressing the gradients. Based on this, we introduce a novel gradient compression method termed 3SFC, which uses the model itself as a decompressor, leveraging training priors such as model weights and objective functions. 3SFC compresses raw gradients into tiny synthetic features in a single-step simulation, incorporating error feedback to minimize overall compression errors. To further reduce communication overhead, 3SFC is extended to E-3SFC, enabling double-way compression and dynamic communication budget scheduling. Our theoretical analysis under both strongly convex and non-convex conditions demonstrates that 3SFC achieves linear and sub-linear convergence rates, respectively, in the presence of aggregation noise. Extensive experiments across six datasets and six models reveal that 3SFC outperforms state-of-the-art methods by up to 13.4% while reducing communication costs by 111.6 times. These findings suggest that 3SFC can significantly enhance communication efficiency in FL without compromising model performance.
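To make the core idea concrete, the following is a minimal, self-contained sketch of the compression-as-inverse-gradient-computation view described above: a client searches for a tiny synthetic feature whose induced gradient on the shared model approximates the raw gradient, transmits only that feature, and carries the residual forward via error feedback. All names (`compress`, `decompress`, the tiny MLP, the optimizer settings) are illustrative assumptions, not the paper's implementation; in particular, 3SFC backpropagates through the gradient computation itself, whereas this sketch substitutes a dependency-free finite-difference descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny shared model (known to both client and server): 4 -> 8 -> 1 MLP,
# tanh hidden layer, squared-error loss. 40 parameters in total.
D_IN, H = 4, 8
W1 = rng.normal(scale=0.5, size=(H, D_IN))
W2 = rng.normal(scale=0.5, size=(H,))

def flat_grad(x, y):
    """Gradient of 0.5 * (f(x) - y)^2 w.r.t. (W1, W2), flattened (manual backprop)."""
    h = np.tanh(W1 @ x)              # hidden activations
    err = W2 @ h - y                 # prediction error (scalar)
    g_W2 = err * h
    g_z = (err * W2) * (1.0 - h ** 2)  # backprop through tanh
    g_W1 = np.outer(g_z, x)
    return np.concatenate([g_W1.ravel(), g_W2])

def compress(g_target, steps=300, lr=0.05, eps=1e-4):
    """Fit a synthetic feature z = (input, label) whose induced gradient
    matches g_target. Finite differences + backtracking stand in for the
    backprop-through-gradients used by the actual method."""
    z = rng.normal(size=D_IN + 1)
    def loss(v):
        return float(np.sum((flat_grad(v[:D_IN], v[D_IN]) - g_target) ** 2))
    hist = []
    for _ in range(steps):
        base = loss(z)
        hist.append(base)
        grad = np.zeros_like(z)
        for i in range(z.size):      # forward finite differences
            zp = z.copy(); zp[i] += eps
            grad[i] = (loss(zp) - base) / eps
        step, cand = lr, z - lr * grad
        while loss(cand) > base and step > 1e-10:  # backtracking line search
            step *= 0.5
            cand = z - step * grad
        if loss(cand) <= base:
            z = cand
    hist.append(loss(z))
    return z, hist

def decompress(z):
    """Server side: rebuild the gradient from the synthetic feature alone,
    since the model weights are already shared."""
    return flat_grad(z[:D_IN], z[D_IN])

# One round: compress a raw gradient (5 numbers sent instead of 40),
# keeping the compression residual in an error-feedback buffer.
x_real, y_real = rng.normal(size=D_IN), 1.0
g_raw = flat_grad(x_real, y_real)    # the gradient a client would upload
e = np.zeros_like(g_raw)             # error-feedback buffer
z, hist = compress(g_raw + e)
g_hat = decompress(z)
e = (g_raw + e) - g_hat              # residual carried into the next round
```

The sketch highlights why the model can act as the decompressor: the server never receives gradient entries, only a feature the model maps back to a gradient, so the achievable compression ratio grows with the ratio of parameter count to input size.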