Machine learning models trained on sensitive or private data can inadvertently memorize and leak that information. Machine unlearning seeks to retroactively remove such information from model weights to protect privacy. We contribute a lightweight unlearning algorithm that leverages the Fisher Information Matrix (FIM) for selective forgetting. Prior work in this area requires full retraining or large matrix inversions, both of which are computationally expensive. Our key insight is that the diagonal elements of the FIM, which measure the sensitivity of the log-likelihood to changes in the weights, contain sufficient information for effective forgetting. Specifically, we compute the FIM diagonal for all trainable weights over two subsets: the data to retain and the data to forget. This diagonal representation approximates the complete FIM while dramatically reducing computation. We then use it to selectively update weights so as to maximize forgetting of the sensitive subset while minimizing the impact on the retained subset. Experiments show that our algorithm can successfully forget randomly selected subsets of the training data across different neural network architectures. By leveraging the FIM diagonal, our approach provides an interpretable, lightweight, and efficient solution for machine unlearning with practical privacy benefits.
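To make the idea of comparing FIM diagonals over the retain and forget subsets concrete, the following is a minimal sketch on a toy logistic regression model, where the diagonal has a closed form. The abstract does not state the actual update rule, so `dampen_weights` and its `ratio_threshold` parameter are hypothetical stand-ins chosen for illustration only, and the toy data and weights are likewise assumptions; this should not be read as the authors' algorithm.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fisher_diagonal(weights, inputs):
    """Diagonal of the Fisher information for logistic regression.

    With p = sigmoid(w . x), the per-example Fisher matrix is
    p * (1 - p) * x x^T, so diagonal entry i is p * (1 - p) * x_i ** 2.
    The result is averaged over the given inputs.
    """
    diag = [0.0] * len(weights)
    for x in inputs:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for i, xi in enumerate(x):
            diag[i] += p * (1.0 - p) * xi * xi
    return [d / len(inputs) for d in diag]


def dampen_weights(weights, fisher_forget, fisher_retain,
                   ratio_threshold=2.0, eps=1e-12):
    """Hypothetical update rule (an assumption, not taken from the abstract).

    A weight is shrunk only when its forget-set Fisher value exceeds
    ratio_threshold times its retain-set value, i.e. when it matters far
    more to the data being forgotten than to the data being kept.
    """
    updated = []
    for w, f, r in zip(weights, fisher_forget, fisher_retain):
        if f > ratio_threshold * r:
            scale = min(1.0, ratio_threshold * r / (f + eps))
            updated.append(w * scale)
        else:
            updated.append(w)
    return updated


# Toy data (assumed): only the forget examples use the third feature.
weights = [0.3, -0.2, 0.8]
retain_inputs = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [1.0, 1.0, 0.0]]
forget_inputs = [[0.1, 0.1, 2.0], [0.1, 0.0, 3.0]]

fisher_retain = fisher_diagonal(weights, retain_inputs)
fisher_forget = fisher_diagonal(weights, forget_inputs)
new_weights = dampen_weights(weights, fisher_forget, fisher_retain)

print("retain FIM diagonal:", fisher_retain)
print("forget FIM diagonal:", fisher_forget)
print("updated weights:    ", new_weights)
```

In this toy setup the third weight carries information only about the forget subset, so its retain-set Fisher value is zero and the sketch drives it to zero while leaving the other two weights unchanged. For a neural network, the same diagonal would typically be estimated from squared per-example gradients of the log-likelihood rather than from a closed form.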