Federated learning (FL) in heterogeneous environments remains challenging because client models often differ in both architecture and data distribution. While recent approaches attempt to address this challenge through client clustering and knowledge distillation, simultaneously handling architectural and statistical heterogeneity remains difficult. We introduce COSMOS, a model-agnostic framework that enables server-side personalization using only pseudo-label communication. Clients train local models and predict on a public dataset; the server clusters clients by prediction similarity, trains a cluster-specific model for each group using its own compute, and distills the resulting cluster models back to clients. We provide the first theoretical analysis showing that distillation from the learned cluster models can yield exponential personalization risk contraction, going beyond the convergence-to-stationarity guarantees typically provided in model-agnostic FL. Experiments across benchmarks demonstrate that COSMOS consistently outperforms all model-agnostic FL baselines while remaining competitive with state-of-the-art personalized FL methods. More broadly, our results highlight personalized server-side learning with pseudo-labels as a promising paradigm for scalable and model-agnostic federated learning in highly heterogeneous environments.
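The server-side grouping step described above can be illustrated with a minimal sketch. It is not the COSMOS implementation: the abstract does not specify the similarity measure or the clustering algorithm, so this example assumes hard pseudo-labels, pairwise label agreement as the similarity, thresholded connected components as the clustering rule, and a majority vote to form per-cluster training targets. All function names, the threshold, and the toy data are hypothetical. Training the cluster-specific models and distilling them back to clients are omitted.

```python
import numpy as np


def agreement_matrix(pseudo_labels: np.ndarray) -> np.ndarray:
    """Pairwise fraction of public samples on which two clients agree.

    pseudo_labels: int array of shape (n_clients, n_public).
    Assumption: agreement rate stands in for "prediction similarity".
    """
    same = pseudo_labels[:, None, :] == pseudo_labels[None, :, :]
    return same.mean(axis=2)


def cluster_clients(similarity: np.ndarray, threshold: float) -> np.ndarray:
    """Cluster ids = connected components of the graph similarity >= threshold.

    Assumption: a simple stand-in for whatever clustering the method uses.
    """
    n = similarity.shape[0]
    cluster_of = np.full(n, -1, dtype=int)
    next_id = 0
    for start in range(n):
        if cluster_of[start] != -1:
            continue  # already assigned to an earlier component
        cluster_of[start] = next_id
        stack = [start]
        while stack:  # depth-first traversal of the component
            i = stack.pop()
            for j in np.flatnonzero(similarity[i] >= threshold):
                if cluster_of[j] == -1:
                    cluster_of[j] = next_id
                    stack.append(j)
        next_id += 1
    return cluster_of


def cluster_pseudo_labels(pseudo_labels, cluster_of, n_classes):
    """Majority-vote labels per cluster (ties go to the lowest class index).

    These would serve as training targets for a server-side cluster model.
    """
    targets = {}
    for c in np.unique(cluster_of):
        members = pseudo_labels[cluster_of == c]  # (n_members, n_public)
        votes = np.stack([(members == k).sum(axis=0) for k in range(n_classes)])
        targets[int(c)] = votes.argmax(axis=0)  # (n_public,)
    return targets


# Toy data: 5 clients, 6 public samples, 3 classes (all values hypothetical).
labels = np.array([
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 1],
    [0, 0, 1, 1, 2, 2],
    [2, 2, 0, 0, 1, 1],
    [2, 2, 0, 0, 1, 1],
])
similarity = agreement_matrix(labels)
clusters = cluster_clients(similarity, threshold=0.5)
targets = cluster_pseudo_labels(labels, clusters, n_classes=3)
print(clusters)
print(targets)
```

In this toy setup only integer pseudo-labels reach the server, which mirrors the communication pattern the abstract describes; no model weights or architectures are exchanged, so clients are free to use different model families.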