With the growing importance of data privacy and security, federated unlearning has emerged as a research field dedicated to ensuring that, once specific data is deleted, federated learning models no longer retain or disclose information about it. In this paper, we propose a zero-shot federated unlearning scheme named Jellyfish. It differs from conventional federated unlearning frameworks in four key aspects: synthetic data generation, knowledge disentanglement, loss function design, and model repair. To preserve the privacy of the forgotten data, we design a zero-shot unlearning mechanism that generates error-minimization noise as proxy data for the data to be forgotten. To maintain model utility, we first propose a knowledge disentanglement mechanism that regularizes the output of the final convolutional layer by restricting the number of channels activated by the data to be forgotten and by encouraging activation sparsity. Next, we construct a comprehensive loss function that combines hard loss, confusion loss, distillation loss, and model weight drift loss with gradient harmonization and gradient masking, so as to align the learning trajectories of the ``forgetting'' and ``retaining'' objectives. Finally, we propose a zero-shot repair mechanism that uses proxy data to restore model accuracy to within acceptable bounds without accessing users' local data. To evaluate the proposed scheme, we conduct comprehensive experiments across diverse settings. The results validate its effectiveness and robustness.
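To make the idea of error-minimization noise more concrete, the following is a minimal sketch and not the Jellyfish implementation. It assumes a frozen linear softmax classifier as a stand-in for the trained model, and it treats "error-minimization noise" as random noise that is iteratively adjusted, within a bounded range, to lower the classification loss for the class to be forgotten. The names `error_minimizing_noise`, `W`, `b`, `x0`, `lr`, and `eps` are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(W, b, x):
    """Logits of a linear classifier: W @ x + b (W is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def cross_entropy(W, b, x, label):
    """Cross-entropy loss of input x with respect to the class index `label`."""
    return -math.log(softmax(logits_of(W, b, x))[label] + 1e-12)


def error_minimizing_noise(W, b, label, x0, steps=200, lr=0.1, eps=1.0):
    """Assumed procedure: projected gradient descent on the INPUT.

    The model (W, b) stays frozen. Starting from the noise vector x0, each
    step moves the input against the loss gradient and clips every
    coordinate back into [-eps, eps].
    """
    x = list(x0)
    for _ in range(steps):
        p = softmax(logits_of(W, b, x))
        # Gradient of cross-entropy w.r.t. the logits is (p - one_hot(label)).
        residual = [p_i - (1.0 if i == label else 0.0) for i, p_i in enumerate(p)]
        # Chain rule through the linear layer: grad_x = W^T @ residual.
        grad = [sum(residual[i] * W[i][j] for i in range(len(W)))
                for j in range(len(x))]
        x = [min(eps, max(-eps, x_j - lr * g_j)) for x_j, g_j in zip(x, grad)]
    return x


# Toy usage: 3 classes, 8 input features, class 1 is the one to be forgotten.
rng = random.Random(0)
W = [[rng.gauss(0.0, 0.5) for _ in range(8)] for _ in range(3)]
b = [0.0, 0.0, 0.0]
x0 = [rng.uniform(-1.0, 1.0) for _ in range(8)]
proxy = error_minimizing_noise(W, b, label=1, x0=x0)
```

In this toy setting the resulting `proxy` vector has a lower loss for class 1 than the initial noise `x0`, so it can stand in for samples of that class without touching any real data. A deep convolutional model would require automatic differentiation in place of the hand-written gradient.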