Model heterogeneous federated learning (MHeteroFL) enables FL clients to collaboratively train models with heterogeneous structures in a distributed fashion. However, existing MHeteroFL methods rely on the training loss alone to transfer knowledge between the client model and the server model, which limits knowledge exchange. To address this limitation, we propose the Federated model heterogeneous Matryoshka Representation Learning (FedMRL) approach for supervised learning tasks. FedMRL adds an auxiliary small homogeneous model that is shared by clients with heterogeneous local models. (1) The generalized and personalized representations extracted by the two models' feature extractors are fused by a personalized lightweight representation projector. This step allows representation fusion to adapt to the local data distribution. (2) The fused representation is then used to construct Matryoshka representations, with multi-dimensional and multi-granular embedded representations learned by the global homogeneous model header and the local heterogeneous model header. This step facilitates multi-perspective representation learning and improves model learning capability. Theoretical analysis shows that FedMRL achieves an $O(1/T)$ non-convex convergence rate. Extensive experiments on benchmark datasets demonstrate that it achieves superior model accuracy with low communication and computational costs compared to seven state-of-the-art baselines, with accuracy improvements of up to 8.48% over the state-of-the-art baseline and up to 24.94% over the best same-category baseline.
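Steps (1) and (2) can be illustrated with a minimal forward-pass sketch for a single sample. It is not the authors' implementation. The abstract does not specify the projector architecture, the representation dimensions, or how the two header losses are combined, so the sketch assumes a single linear projector over the concatenated representations, a coarse representation taken as the leading `coarse_dim` entries of the fused vector, and an unweighted sum of the two cross-entropy losses. All function and variable names are hypothetical.

```python
import math
import random


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def random_layer(rng, out_dim, in_dim):
    """Return (weights, bias) for a randomly initialised linear layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def matryoshka_loss(rep_global, rep_local, projector,
                    global_header, local_header, label, coarse_dim):
    # Step (1): fuse the generalized and personalized representations.
    # Assumption: concatenation followed by one linear projector.
    fused = linear(*projector, rep_global + rep_local)

    # Step (2): nested (Matryoshka) views of the fused representation.
    # Assumption: the coarse view is a prefix of the fine view.
    coarse = fused[:coarse_dim]
    fine = fused

    # Each header predicts from its own granularity.
    loss_global = cross_entropy(linear(*global_header, coarse), label)
    loss_local = cross_entropy(linear(*local_header, fine), label)

    # Assumption: the two losses are summed with equal weight.
    return fused, loss_global + loss_local


rng = random.Random(0)
num_classes, global_dim, local_dim = 3, 4, 6

# Stand-ins for the outputs of the two feature extractors.
rep_global = [rng.gauss(0, 1) for _ in range(global_dim)]
rep_local = [rng.gauss(0, 1) for _ in range(local_dim)]

projector = random_layer(rng, local_dim, global_dim + local_dim)
global_header = random_layer(rng, num_classes, global_dim)
local_header = random_layer(rng, num_classes, local_dim)

fused, total_loss = matryoshka_loss(
    rep_global, rep_local, projector,
    global_header, local_header, label=1, coarse_dim=global_dim)
print(len(fused), total_loss)
```

In this sketch, only the small homogeneous model (its feature extractor and `global_header`) would be exchanged with the server, while the projector and the heterogeneous local model stay on the client. That split follows the abstract's description of a shared homogeneous model and a personalized projector.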