Machine learning models trained on sensitive or private data can inadvertently memorize and leak that information. Machine unlearning seeks to retroactively remove such information from model weights to protect privacy. We contribute a lightweight unlearning algorithm that leverages the Fisher Information Matrix (FIM) for selective forgetting. Prior work in this area requires full retraining or large matrix inversions, both of which are computationally expensive. Our key insight is that the diagonal elements of the FIM, which measure the sensitivity of the log-likelihood to changes in each weight, contain sufficient information for effective forgetting. Specifically, we compute the FIM diagonal for all trainable weights over two subsets: the data to retain and the data to forget. This diagonal representation approximates the full FIM while dramatically reducing computation. We then use it to selectively update weights so as to maximize forgetting of the sensitive subset while minimizing the impact on the retained subset. Experiments show that our algorithm can successfully forget randomly selected subsets of the training data across neural network architectures. By leveraging the FIM diagonal, our approach provides an interpretable, lightweight, and efficient solution for machine unlearning with practical privacy benefits.
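To make the two-subset computation concrete, the following is a minimal sketch on a toy logistic regression model, not the algorithm from this work. It makes several assumptions that the abstract does not state: the FIM diagonal is estimated empirically as the mean of squared per-sample gradients of the log-likelihood, and the weight update is a hypothetical dampening rule that shrinks a weight when its forget-set Fisher value exceeds `alpha` times its retain-set value. The names `fisher_diagonal`, `dampen`, `alpha`, and `eps` are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fisher_diagonal(weights, data):
    """Empirical FIM diagonal: mean squared per-sample gradient of the
    log-likelihood (assumed estimator; logistic regression, labels in {0, 1})."""
    diag = [0.0] * len(weights)
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for i, xi in enumerate(x):
            # d/dw_i log p(y | x, w) = (y - p) * x_i for logistic regression
            g = (y - p) * xi
            diag[i] += g * g
    return [d / len(data) for d in diag]


def dampen(weights, fisher_retain, fisher_forget, alpha=1.0, eps=1e-12):
    """Hypothetical update rule (an assumption, not taken from the source):
    shrink weights that matter more to the forget set than to the retain set."""
    updated = []
    for w, f_r, f_f in zip(weights, fisher_retain, fisher_forget):
        if f_f > alpha * f_r:
            # Scale factor in [0, 1]; smaller when the forget set dominates.
            updated.append(w * min(1.0, alpha * f_r / (f_f + eps)))
        else:
            updated.append(w)
    return updated


# Toy data: the retain set only uses features 0 and 1,
# the forget set only uses feature 2.
weights = [0.5, -1.0, 2.0]
retain = [([1.0, 0.0, 0.0], 1), ([0.0, 1.0, 0.0], 0), ([1.0, 1.0, 0.0], 1)]
forget = [([0.0, 0.0, 1.0], 1), ([0.0, 0.0, 2.0], 0)]

f_retain = fisher_diagonal(weights, retain)
f_forget = fisher_diagonal(weights, forget)
updated = dampen(weights, f_retain, f_forget)
print(updated)  # [0.5, -1.0, 0.0]
```

In this toy setting only the weight tied to the forget-only feature is changed, which illustrates the intended selectivity. Each Fisher estimate needs a single gradient pass over its subset and storage linear in the number of weights, with no matrix inversion.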