DriftGuard: Mitigating Asynchronous Data Drift in Federated Learning

In real-world Federated Learning (FL) deployments, data distributions on devices that participate in training evolve over time. This leads to asynchronous data drift, where different devices shift at different times and toward different distributions. Mitigating such drift is challenging: frequent retraining incurs high computational cost on resource-constrained devices, while infrequent retraining degrades performance on drifting devices. We propose DriftGuard, a federated continual learning framework that efficiently adapts to asynchronous data drift. DriftGuard adopts a Mixture-of-Experts (MoE) inspired architecture that separates shared parameters, which capture globally transferable knowledge, from local parameters that adapt to group-specific distributions. This design enables two complementary retraining strategies: (i) global retraining, which updates the shared parameters when system-wide drift is identified, and (ii) group retraining, which selectively updates local parameters for clusters of devices identified via MoE gating patterns, without sharing raw data. Experiments across multiple datasets and models show that DriftGuard matches or exceeds state-of-the-art accuracy while reducing total retraining cost by up to 83%. As a result, it achieves the highest accuracy per unit retraining cost, improving over the strongest baseline by up to 2.3x. DriftGuard is available for download from https://github.com/blessonvar/DriftGuard.
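The choice between the two retraining strategies can be illustrated with a minimal sketch. It is not DriftGuard's implementation: the abstract does not specify how gating patterns are clustered or how system-wide drift is identified. The sketch assumes that each device reports only its average gate weight per expert, that devices are grouped by their dominant expert, and that drift counts as system-wide once a hypothetical `global_fraction` of devices report it. The names `group_by_gating` and `plan_retraining` are likewise illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Set


def group_by_gating(gate_usage: Dict[str, List[float]]) -> Dict[int, List[str]]:
    """Group devices by the expert with the largest average gate weight.

    Assumption: dominant-expert grouping stands in for whatever clustering
    the real system applies to MoE gating patterns. Only gate statistics
    are used here, never raw data.
    """
    groups: Dict[int, List[str]] = defaultdict(list)
    for device, usage in sorted(gate_usage.items()):
        dominant = max(range(len(usage)), key=usage.__getitem__)
        groups[dominant].append(device)
    return dict(groups)


def plan_retraining(
    gate_usage: Dict[str, List[float]],
    drifted: Set[str],
    global_fraction: float = 0.5,  # hypothetical threshold, not from the paper
) -> dict:
    """Decide between global retraining, group retraining, or no retraining."""
    if not gate_usage:
        raise ValueError("gate_usage must contain at least one device")

    drift_share = sum(1 for d in gate_usage if d in drifted) / len(gate_usage)
    if drift_share >= global_fraction:
        # Drift is treated as system-wide: update the shared parameters.
        return {"mode": "global", "groups": []}

    # Otherwise update local parameters only for groups with drifting devices.
    groups = group_by_gating(gate_usage)
    affected = sorted(
        gid for gid, members in groups.items()
        if any(d in drifted for d in members)
    )
    if affected:
        return {"mode": "group", "groups": affected}
    return {"mode": "none", "groups": []}


if __name__ == "__main__":
    usage = {
        "dev_a": [0.8, 0.2],
        "dev_b": [0.7, 0.3],
        "dev_c": [0.1, 0.9],
        "dev_d": [0.2, 0.8],
    }
    print(plan_retraining(usage, {"dev_c"}))
    print(plan_retraining(usage, {"dev_a", "dev_c", "dev_d"}))
```

With four devices split across two experts, drift on one device triggers retraining only for that device's group, while drift on three of the four triggers global retraining under the assumed threshold.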

