Federated learning is a promising privacy-preserving paradigm for distributed machine learning. In this context, there is sometimes a need for a specialized process called machine unlearning, which is required when the effect of specific training samples must be removed from a learned model for privacy, security, usability, or legislative reasons. However, problems arise when current centralized unlearning methods are applied to existing federated learning settings in which the server aims to remove all information about a class from the global model. Centralized unlearning usually focuses on simple models or is premised on the ability to access all training data at a central node. Under the federated learning paradigm, however, the server cannot access the training data, which conflicts with the requirements of the centralized unlearning process. In addition, accessing clients' data incurs high computation and communication costs, especially in scenarios involving numerous clients or complex global models. To address these concerns, we propose a more effective and efficient federated unlearning scheme based on the concept of model explanation. Model explanation seeks to understand deep networks and the importance of individual channels, and this understanding can be used to determine which model channels are critical for the classes that need to be unlearned. We select the channels of an already-trained model that are most influential for the data to be unlearned and fine-tune only those channels to remove the contribution made by that data. In this way, we avoid large computation and communication costs while ensuring that the unlearned model maintains good performance. Experiments with different training models on various datasets demonstrate the effectiveness of the proposed approach.
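To make the idea of explanation-guided selective fine-tuning concrete, the following is a minimal, hypothetical sketch in PyTorch; it is not the paper's exact procedure. It assumes a convolutional model and uses the mean absolute activation of each channel on forget-class samples as a simple stand-in for the channel-importance score, then fine-tunes only the top-scoring channels of one layer while freezing everything else. The function names, the `top_k` parameter, and the masking scheme are illustrative assumptions.

```python
# Illustrative sketch only: activation-based channel scoring as a proxy for
# the model-explanation step, followed by selective fine-tuning.
import torch
import torch.nn as nn

def channel_importance(model, layer, loader_forget, device="cpu"):
    """Score each output channel of `layer` by its mean absolute activation
    on forget-class samples (a simple proxy for an explanation-based score)."""
    acts = []
    def hook(_module, _inputs, out):
        # out: (batch, channels, H, W) -> per-channel mean activation
        acts.append(out.detach().abs().mean(dim=(0, 2, 3)))
    handle = layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for x, _ in loader_forget:
            model(x.to(device))
    handle.remove()
    return torch.stack(acts).mean(dim=0)  # shape: (channels,)

def unlearn_by_channel_finetune(model, layer, scores, loader_retain,
                                top_k=8, epochs=1, lr=1e-3, device="cpu"):
    """Fine-tune only the top-k most influential channels of `layer` on the
    retained data, freezing all other parameters (hypothetical procedure)."""
    top = torch.topk(scores, k=top_k).indices
    mask = torch.zeros_like(layer.weight)       # (out_ch, in_ch, kH, kW)
    mask[top] = 1.0
    for p in model.parameters():                # freeze everything ...
        p.requires_grad_(False)
    layer.weight.requires_grad_(True)           # ... except the chosen layer
    opt = torch.optim.SGD([layer.weight], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader_retain:
            opt.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()
            layer.weight.grad *= mask           # update only selected channels
            opt.step()
    return model
```

In a federated setting, one would expect the server to perform the channel selection on the global model and coordinate the limited fine-tuning with clients' local data, so that only a small fraction of parameters is touched and communicated; the sketch above shows only the local, single-model view of that idea.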