Machine unlearning aims to remove specific data from trained models, addressing growing privacy and ethical concerns. We provide a theoretical analysis of a simple and widely used method, gradient ascent, which is used to reverse the influence of a specific data point without retraining from scratch. Leveraging the implicit bias of gradient descent towards solutions that satisfy the Karush-Kuhn-Tucker (KKT) conditions of a margin-maximization problem, we quantify the quality of the unlearned model by evaluating how well it satisfies these conditions with respect to the retained data. To formalize this idea, we propose a new success criterion, termed \textbf{$(\epsilon, \delta, \tau)$-successful} unlearning, and show that, for both linear models and two-layer neural networks with high-dimensional data, a properly scaled gradient-ascent step satisfies this criterion and yields a model that closely approximates the retrained solution on the retained data. We further show, in a synthetic Gaussian-mixture setting, that gradient ascent performs successful unlearning while preserving generalization.
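The gradient-ascent unlearning step discussed above can be illustrated with a minimal sketch. The model class (a linear classifier), the logistic loss, and the step size `eta` below are illustrative assumptions for exposition, not the paper's exact construction or scaling:

```python
import numpy as np

def unlearn_step(w, x_f, y_f, eta):
    """One gradient-ascent step on the loss of the forgotten point.

    Illustrative sketch: w is a linear classifier, (x_f, y_f) is the
    forget point with label y_f in {-1, +1}, and eta is an assumed
    ascent step size standing in for the "proper scaling" analyzed
    in the paper.
    """
    margin = y_f * (w @ x_f)
    # Gradient of the logistic loss log(1 + exp(-margin)) w.r.t. w.
    grad = -y_f * x_f / (1.0 + np.exp(margin))
    # Ascend (rather than descend) the loss on the forgotten point,
    # pushing the model away from the fit to that point.
    return w + eta * grad

w = np.array([1.0, 0.5])
x_f, y_f = np.array([2.0, 1.0]), 1.0
w_unlearned = unlearn_step(w, x_f, y_f, eta=0.1)
# The margin on the forgotten point shrinks after the ascent step.
print(y_f * (w_unlearned @ x_f) < y_f * (w @ x_f))
```

A single such step moves the parameters opposite to the descent direction for the forgotten point, which is the mechanism whose effect on the KKT conditions of the retained data the paper quantifies.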