Machine unlearning refers to efficiently removing the influence of a specified subset of training data from a machine learning model after it has already been trained. This capability is important for key applications, such as improving a model's accuracy by removing outdated, mislabeled, or poisoned data. In this work, we study localized unlearning, where the unlearning algorithm operates on a (small) identified subset of parameters. Drawing inspiration from the memorization literature, we propose an improved localization strategy that yields strong results when paired with existing unlearning algorithms. We also propose a new unlearning algorithm, Deletion by Example Localization (DEL), which resets the parameters deemed most critical according to our localization strategy and then finetunes them. Our extensive experiments across datasets, forget sets, and metrics reveal that DEL sets a new state of the art on unlearning metrics, against both localized and full-parameter methods, while modifying only a small subset of parameters, and it also outperforms the prior state-of-the-art localized unlearning method in terms of test accuracy.
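The localize-reset-finetune recipe described above can be sketched in a few lines. The following is a minimal illustration, not the paper's method: it uses a toy linear model, gradient magnitude on the forget set as the localization criterion, and zero as the reset value, all of which are assumptions made for the sake of a self-contained example.

```python
import numpy as np

# Hedged sketch of a localized unlearning loop: (1) localize the parameters
# most critical to the forget set, (2) reset them, (3) finetune them on the
# retain set. The saliency criterion and the toy linear model are
# illustrative assumptions, not the paper's exact localization strategy.

rng = np.random.default_rng(0)

# Toy data: linear regression, y = X @ w_true + noise.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=200)

# Split into a forget set and a retain set.
X_forget, X_retain = X[:20], X[20:]
y_forget, y_retain = y[:20], y[20:]

# A "trained model": least-squares fit on all the data.
w = np.linalg.lstsq(X, y, rcond=None)[0]

def grad(Xs, ys, w):
    """Gradient of mean squared error with respect to w."""
    return 2.0 * Xs.T @ (Xs @ w - ys) / len(Xs)

# Step 1: localize -- rank parameters by gradient magnitude on the forget set
# and keep the top-k as the "critical" subset (k is an arbitrary choice here).
k = 3
saliency = np.abs(grad(X_forget, y_forget, w))
critical = np.argsort(saliency)[-k:]

# Step 2: reset the critical parameters (here simply to zero).
w_unlearned = w.copy()
w_unlearned[critical] = 0.0

# Step 3: finetune only the critical parameters on the retain set;
# all other parameters stay frozen.
lr = 0.05
for _ in range(500):
    g = grad(X_retain, y_retain, w_unlearned)
    w_unlearned[critical] -= lr * g[critical]

retain_mse = float(np.mean((X_retain @ w_unlearned - y_retain) ** 2))
print(retain_mse)
```

Because only the k critical coordinates are updated while the rest stay frozen, the finetuning step is a small convex subproblem here; in a deep network the same pattern would apply a binary mask to the parameter tensors instead.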