Federated Learning (FL) is a decentralized machine learning paradigm that enables clients to collaboratively train models while preserving data privacy. However, when model heterogeneity and data heterogeneity coexist, clients learn inconsistent representations and follow divergent optimization dynamics, which hinders robust global performance. To address these challenges, we propose Mosaic, a novel data-free knowledge distillation framework tailored for heterogeneous distributed environments. Mosaic first trains local generative models to approximate each client's personalized distribution, enabling synthetic data generation that safeguards privacy through strict separation from real data. Mosaic then forms a Mixture-of-Experts (MoE) from the client models based on their specialized knowledge and distills it into a global model using the generated data. To further enhance the MoE architecture, Mosaic integrates expert predictions via a lightweight meta model trained on a few representative prototypes. Extensive experiments on standard image and multimodal benchmarks demonstrate that Mosaic consistently outperforms state-of-the-art approaches under both model and data heterogeneity. The source code is available at https://github.com/Wings-Of-Disaster/Mosaic.
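The distillation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed gate weights (standing in for the output of the meta model), the temperature parameter, and the choice of a KL-divergence loss are all assumptions made for illustration. The sketch combines per-client (expert) predictions on one synthetic sample into a teacher distribution and measures how far a global (student) model's prediction is from it.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_teacher(expert_logits, gate_weights, temperature=1.0):
    """Blend the experts' class distributions using normalized gate weights.

    Assumption: gate_weights are given. In Mosaic they would come from the
    meta model; here they are plain numbers supplied by the caller.
    """
    weight_sum = sum(gate_weights)
    weights = [w / weight_sum for w in gate_weights]
    num_classes = len(expert_logits[0])
    teacher = [0.0] * num_classes
    for w, logits in zip(weights, expert_logits):
        probs = softmax(logits, temperature)
        for c in range(num_classes):
            teacher[c] += w * probs[c]
    return teacher


def kd_loss(student_logits, teacher_probs, temperature=1.0):
    """KL(teacher || student) for one synthetic sample (assumed loss)."""
    student_probs = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0.0
    )


# Two hypothetical client experts scoring one synthetic sample over 3 classes.
expert_logits = [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
gate_weights = [0.8, 0.2]  # hypothetical gate output favoring the first expert

teacher = mixture_teacher(expert_logits, gate_weights)
loss = kd_loss([2.0, 1.0, 0.0], teacher)
```

In a full training loop, this loss would be averaged over batches of generated samples and minimized with respect to the global model's parameters; the sketch only shows the per-sample computation.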