Federated Learning (FL) is a distributed training paradigm in which geographically dispersed clients cooperatively learn a global model without sharing their confidential data. A major challenge in FL is that data distributions are heterogeneous across clients, which degrades both performance and robustness. One recent way to mitigate this is to use foundation models, which perform better but incur higher computational overhead and slower inference. We introduce foundation model distillation to assist the federated training of lightweight client models, improving their performance under heterogeneous data while keeping inference costs low. Our results show improved global model performance on a balanced test set, which contains rarely observed samples, even under extreme non-IID client data distributions. We evaluate our framework thoroughly with different foundation model backbones on CIFAR10, under degrees of heterogeneity ranging from class-specific data partitions across clients to Dirichlet data sampling with parameter values between 0.01 and 1.0.
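The abstract does not state the exact distillation objective, so the following is only a minimal sketch of one common formulation: a weighted sum of the student's cross-entropy on the true label and a temperature-scaled KL divergence from the teacher (foundation model) to the student (lightweight client model). The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not the framework's actual interface.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax, computed stably via log-sum-exp."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed objective: alpha * CE(student, label)
    + (1 - alpha) * T^2 * KL(teacher || student)."""
    # Hard-label term: student cross-entropy at temperature 1.
    ce = -log_softmax(student_logits)[label]

    # Soft-label term: KL divergence between softened distributions.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # The T^2 factor is a common convention (assumed here) that keeps the
    # soft-label gradients on a scale comparable to the hard-label term.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

When the student's logits match the teacher's, the KL term vanishes and only the cross-entropy term remains; in a federated setting each client would minimize a loss of this kind locally before its lightweight model is aggregated.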
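The Dirichlet data sampling used to control heterogeneity can be illustrated with a short sketch. It assumes the common label-based scheme in which, for each class, per-client proportions are drawn from a symmetric Dirichlet distribution with parameter `alpha`; the abstract does not confirm that this exact variant is used, and the function name, arguments, and seed handling are hypothetical.

```python
import random


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, class by class, using
    proportions drawn from a symmetric Dirichlet(alpha) distribution."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)

        # Normalized Gamma(alpha, 1) draws form one Dirichlet sample.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        if total == 0.0:
            # Underflow guard for very small alpha: one client gets the class.
            weights = [0.0] * num_clients
            weights[rng.randrange(num_clients)] = 1.0
            total = 1.0

        # Cut the shuffled indices at the cumulative proportions.
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += weights[c] / total
            if c == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(len(indices), round(cumulative * len(indices)))
            clients[c].extend(indices[start:end])
            start = end
    return clients
```

Smaller values of `alpha` (for example 0.01) concentrate each class on very few clients, approaching class-specific partitions, while values near 1.0 spread each class more evenly across clients.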