Communication overhead is one of the major challenges in federated learning (FL). A few classical schemes assume that the server can extract auxiliary information about the participants' training data from their local models and use it to construct a central dummy dataset. The server then fine-tunes the aggregated global model on this dummy dataset so that the target test accuracy is reached in fewer communication rounds. In this paper, we summarize these solutions into a data-based communication-efficient FL framework. The key to the framework is an efficient extraction module (EM) which ensures that the dummy dataset has a positive effect on fine-tuning the aggregated global model. Unlike existing methods, which use a generator to design the EM, our proposed method, FedINIBoost, borrows the idea of gradient matching to construct the EM. Specifically, at each communication round FedINIBoost builds, in two steps, a proxy dataset of each participant's real dataset. The server then aggregates all the proxy datasets into a central dummy dataset, which is used to fine-tune the aggregated global model. Extensive experiments verify the superiority of our method over the existing classical methods FedAVG, FedProx, Moon, and FedFTG. Moreover, FedINIBoost plays a significant role in improving the performance of the aggregated global model through fine-tuning at the initial stage of FL.
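To make the gradient-matching idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear model with a mean squared error loss so that the matching gradient can be written in closed form, and it does not reproduce the two-step procedure mentioned above. All names (`model_grad`, `match_loss_and_grad`, `build_proxy`) and all sizes are hypothetical and chosen only for illustration. The sketch optimizes a small proxy dataset so that the gradient it induces at the global weights approaches the gradient induced by a participant's real data.

```python
import numpy as np


def model_grad(X, y, w):
    """Gradient of 0.5 * mean((X w - y)^2) with respect to the weights w."""
    return X.T @ (X @ w - y) / len(y)


def match_loss_and_grad(X, y, w, g_target):
    """Matching loss 0.5 * ||g(X) - g_target||^2 and its gradient w.r.t. X."""
    r = X @ w - y                          # residuals on the proxy data
    e = X.T @ r / len(y) - g_target        # gradient mismatch
    grad_X = (np.outer(r, e) + np.outer(X @ e, w)) / len(y)
    return 0.5 * float(e @ e), grad_X


def build_proxy(w, g_target, y_proxy, steps=200, lr=1.0, seed=0):
    """Optimize proxy inputs by gradient descent with step halving.

    Assumption: proxy labels are fixed in advance and only inputs are learned.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(len(y_proxy), len(w)))
    loss, grad = match_loss_and_grad(X, y_proxy, w, g_target)
    start_loss = loss
    for _ in range(steps):
        X_new = X - lr * grad
        new_loss, new_grad = match_loss_and_grad(X_new, y_proxy, w, g_target)
        if new_loss < loss:                # accept only improving steps
            X, loss, grad = X_new, new_loss, new_grad
        else:                              # otherwise shrink the step size
            lr *= 0.5
    return X, start_loss, loss


# Hypothetical participant: 50 real samples with 5 features.
rng = np.random.default_rng(42)
X_real = rng.normal(size=(50, 5))
y_real = rng.normal(size=50)
w_global = rng.normal(size=5)

# Gradient the real data induces at the current global weights.
g_target = model_grad(X_real, y_real, w_global)

# Build an 8-sample proxy dataset whose gradient approximates g_target.
y_proxy = rng.normal(size=8)
X_proxy, init_loss, final_loss = build_proxy(w_global, g_target, y_proxy)
```

In a full pipeline the server would concatenate the proxy datasets of all participants and take a few gradient steps on them starting from the aggregated weights. How the target gradient is obtained from a local model, and how proxy labels are chosen, are left open here as assumptions.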