Federated Model Heterogeneous Matryoshka Representation Learning

Model heterogeneous federated learning (MHeteroFL) enables FL clients to collaboratively train models with heterogeneous structures in a distributed fashion. However, existing MHeteroFL methods rely on training loss to transfer knowledge between the client model and the server model, resulting in limited knowledge exchange. To address this limitation, we propose the Federated model heterogeneous Matryoshka Representation Learning (FedMRL) approach for supervised learning tasks. It adds an auxiliary small homogeneous model that is shared by clients with heterogeneous local models. (1) The generalized and personalized representations extracted by the two models' feature extractors are fused by a personalized lightweight representation projector. This step enables representation fusion to adapt to the local data distribution. (2) The fused representation is then used to construct Matryoshka representations with multi-dimensional and multi-granular embedded representations learned by the global homogeneous model header and the local heterogeneous model header. This step facilitates multi-perspective representation learning and improves model learning capability. Theoretical analysis shows that FedMRL achieves an $O(1/T)$ non-convex convergence rate. Extensive experiments on benchmark datasets demonstrate its superior model accuracy with low communication and computational costs compared to seven state-of-the-art baselines. It achieves up to 8.48% and 24.94% accuracy improvement compared with the state-of-the-art and the best same-category baseline, respectively.
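The two steps in the abstract can be illustrated with a minimal forward-pass sketch for a single client and a single sample. The abstract does not specify the details, so everything concrete here is an assumption made for illustration: the feature extractors are stand-in single dense layers, fusion is concatenation followed by a linear projector, the Matryoshka representations are taken as a prefix of the fused vector (for the small homogeneous header) and the full fused vector (for the local heterogeneous header), and the two cross-entropy losses are summed with equal weight. All function and parameter names (`fedmrl_forward`, `small_dim`, and so on) are hypothetical.

```python
import math
import random

def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_linear(rng, out_dim, in_dim):
    """Small random weights and a zero bias for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def fedmrl_forward(x, label, params, small_dim):
    """One forward pass; returns the fused representation and the summed loss."""
    # Stand-in feature extractors (assumption: one dense layer with tanh each).
    generalized = [math.tanh(v) for v in linear(*params["homo_extractor"], x)]
    personalized = [math.tanh(v) for v in linear(*params["hetero_extractor"], x)]
    # Step (1): fuse both representations with a per-client projector.
    # Assumption: concatenate, then apply a linear map.
    fused = linear(*params["projector"], generalized + personalized)
    # Step (2): nested (Matryoshka) views of the fused representation.
    # Assumption: the coarse view is the first `small_dim` entries.
    coarse = fused[:small_dim]   # fed to the shared homogeneous header
    fine = fused                 # fed to the local heterogeneous header
    loss_global = cross_entropy(linear(*params["homo_header"], coarse), label)
    loss_local = cross_entropy(linear(*params["hetero_header"], fine), label)
    # Assumption: equal weighting of the two losses.
    return fused, loss_global + loss_local

# Toy dimensions chosen only for the demonstration.
rng = random.Random(0)
in_dim, small_dim, large_dim, num_classes = 8, 4, 6, 3
params = {
    "homo_extractor": init_linear(rng, small_dim, in_dim),
    "hetero_extractor": init_linear(rng, large_dim, in_dim),
    "projector": init_linear(rng, large_dim, small_dim + large_dim),
    "homo_header": init_linear(rng, num_classes, small_dim),
    "hetero_header": init_linear(rng, num_classes, large_dim),
}
x = [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
fused, loss = fedmrl_forward(x, 1, params, small_dim)
```

In a federated round under these assumptions, only the small homogeneous extractor and header would be exchanged with the server, while the heterogeneous model and the projector stay on the client. This is consistent with the abstract's description of a shared auxiliary model and a personalized projector, but the exact protocol is not given there.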
