Modern big-data systems generate heterogeneous, geographically dispersed, and privacy-sensitive data streams at a scale that makes centralization challenging. While federated learning (FL) provides a privacy-enhancing training mechanism, it assumes a static data flow and learns a collaborative model over multiple communication rounds, making learning from \textit{incremental} data challenging in limited-communication scenarios. This paper presents One-Shot Incremental Federated Learning (OSI-FL), the first FL framework that jointly addresses the dual challenges of communication overhead and catastrophic forgetting. In a single communication round, each client transmits category-specific embeddings produced by a frozen vision-language model (VLM), which a pre-trained diffusion model at the server uses to synthesize new data that approximates the client's data distribution. The synthesized samples are then used for training on the server. However, two challenges persist: i) tasks arriving incrementally require retraining the global model, and ii) as future tasks arrive, this retraining introduces catastrophic forgetting. To this end, we augment training with Selective Sample Retention (SSR), which identifies and retains the top-p most informative samples per category-task pair based on sample loss. SSR bounds forgetting by ensuring that representative retained samples are incorporated into training in subsequent iterations. Experimental results indicate that OSI-FL outperforms baselines, including traditional and one-shot FL approaches, in both class-incremental and domain-incremental scenarios across three benchmark datasets.
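The SSR step described above can be sketched in a few lines. This is a minimal illustrative implementation, assuming per-sample losses are already available and interpreting p as a retained-sample count per (category, task) pair; the data layout and function name are hypothetical, only the loss-based top-p ranking criterion follows the abstract.

```python
# Hypothetical sketch of Selective Sample Retention (SSR):
# keep the top-p highest-loss ("most informative") samples per (category, task).
from collections import defaultdict

def selective_sample_retention(samples, p):
    """samples: iterable of (category, task, sample_id, loss) tuples.

    Returns the ids of the p highest-loss samples in each (category, task) bucket.
    """
    buckets = defaultdict(list)
    for category, task, sample_id, loss in samples:
        buckets[(category, task)].append((loss, sample_id))
    retained = []
    for items in buckets.values():
        items.sort(reverse=True)  # highest loss (most informative) first
        retained.extend(sample_id for _, sample_id in items[:p])
    return retained
```

The retained ids would then index into the replay buffer that is merged with the synthesized samples when the next task's model is trained on the server.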