Federated learning (FL) enables edge devices to cooperatively train a globally shared model while keeping their training data local and private. However, a common but impractical assumption in FL is that all participating edge devices have the same resources and share an identical global model architecture. In this study, we propose a novel FL method called Federated Intermediate Layers Learning (FedIN), which supports heterogeneous models without relying on any public dataset. Each model trained in FedIN is divided into three parts: an extractor, the intermediate layers, and a classifier. The extractor and classifier architectures are the same on all devices to keep the intermediate-layer features consistent, while the architecture of the intermediate layers can vary across heterogeneous devices according to their resource capacities. To exploit the knowledge carried by these features, we propose IN training, which trains the intermediate layers to align with the features from other clients. Additionally, we formulate and solve a convex optimization problem to mitigate the gradient divergence caused by conflicts between IN training and local training. Experimental results show that FedIN achieves the best performance in heterogeneous model environments compared with state-of-the-art algorithms. Furthermore, our ablation study demonstrates the effectiveness of both IN training and the solution to the convex optimization problem.
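The three-part split described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class `SplitModel`, the helper `in_loss`, the use of bias-free linear layers, and the mean-squared-error objective are all assumptions made for illustration. In particular, the abstract only says that the intermediate layers are trained in line with features from other clients; sharing an (input feature, output feature) pair is one plausible reading.

```python
import random


def linear(weights, x):
    """Bias-free linear layer; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def make_layer(n_out, n_in, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


class SplitModel:
    """Hypothetical client model: extractor -> intermediate layers -> classifier."""

    def __init__(self, in_dim, feat_dim, n_classes, depth, seed):
        rng = random.Random(seed)
        # Same architecture on every client, so feature shapes match.
        self.extractor = make_layer(feat_dim, in_dim, rng)
        # Depth may differ per client according to its resources.
        self.intermediate = [make_layer(feat_dim, feat_dim, rng) for _ in range(depth)]
        # Same architecture on every client.
        self.classifier = make_layer(n_classes, feat_dim, rng)

    def extract(self, x):
        return relu(linear(self.extractor, x))

    def run_intermediate(self, feat):
        for layer in self.intermediate:
            feat = relu(linear(layer, feat))
        return feat

    def forward(self, x):
        return linear(self.classifier, self.run_intermediate(self.extract(x)))


def in_loss(model, feat_in, feat_out):
    """Assumed IN objective: MSE between this client's intermediate output
    and the output feature received from another client."""
    pred = model.run_intermediate(feat_in)
    return sum((p - t) ** 2 for p, t in zip(pred, feat_out)) / len(pred)


IN_DIM, FEAT_DIM, N_CLASSES = 4, 3, 2
small = SplitModel(IN_DIM, FEAT_DIM, N_CLASSES, depth=1, seed=0)  # weak device
large = SplitModel(IN_DIM, FEAT_DIM, N_CLASSES, depth=3, seed=1)  # strong device

x = [0.2, -0.1, 0.4, 0.3]
feat_in = large.extract(x)                  # feature entering the intermediate layers
feat_out = large.run_intermediate(feat_in)  # feature leaving them
loss_small = in_loss(small, feat_in, feat_out)
```

Because the extractor and classifier have the same shape on both clients, the feature pair produced by the deeper model can be consumed by the shallower one even though their intermediate layers differ, and no public dataset is involved.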
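The abstract does not state the convex optimization problem itself. As a hedged illustration of how a conflict between two gradients can be resolved in closed form, the sketch below uses a projection from the gradient-conflict literature: it finds the vector closest to the local gradient whose inner product with the IN gradient is non-negative. The function name `resolve_conflict` and the choice of which gradient is constrained are assumptions, not details taken from FedIN.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def resolve_conflict(g_local, g_in):
    """Solve  min_g 0.5 * ||g - g_local||^2  subject to  dot(g, g_in) >= 0.

    This is an assumed formulation for illustration only. If the two
    gradients already agree (non-negative inner product), g_local is
    returned unchanged; otherwise its component along g_in is removed.
    """
    inner = dot(g_local, g_in)
    norm_sq = dot(g_in, g_in)
    if inner >= 0.0 or norm_sq == 0.0:
        return list(g_local)
    scale = inner / norm_sq
    return [gl - scale * gi for gl, gi in zip(g_local, g_in)]


# Conflicting pair: the inner product is negative, so a projection is applied.
conflicting = resolve_conflict([1.0, 0.0], [-1.0, 1.0])
# Agreeing pair: the local gradient is kept as is.
aligned = resolve_conflict([1.0, 0.0], [1.0, 1.0])
```

The problem is convex (a quadratic objective with one linear constraint), which is why a closed-form solution exists and no iterative solver is needed in this sketch.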