Machine unlearning, the ability of a machine learning model to forget, is becoming increasingly important for complying with data privacy regulations and for removing harmful, manipulated, or outdated information. The key challenge lies in forgetting specific information while protecting model performance on the remaining data. While current state-of-the-art methods perform well, they typically require some level of retraining over the retained data to protect or restore model performance. This adds computational overhead and requires that the training data remain available and accessible, which may not be feasible. Other methods employ a retrain-free paradigm; however, these approaches are computationally prohibitive and do not perform on par with their retrain-based counterparts. We present Selective Synaptic Dampening (SSD), a novel two-step, post hoc, retrain-free approach to machine unlearning that is fast, performant, and does not require long-term storage of the training data. First, SSD uses the Fisher information matrix of the training and forgetting data to select parameters that are disproportionately important to the forget set. Second, SSD induces forgetting by dampening these parameters in proportion to their importance to the forget set relative to the wider training data. We evaluate our method against several existing unlearning methods in a range of experiments using ResNet18 and Vision Transformer. Results show that the performance of SSD is competitive with retrain-based post hoc methods, demonstrating the viability of retrain-free post hoc unlearning approaches.
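The two steps described above can be illustrated with a minimal NumPy sketch. It is one plausible reading of the abstract, not the authors' implementation: the diagonal Fisher approximation via mean squared per-sample gradients, the selection threshold `alpha`, the dampening constant `lam`, and the function names are all assumptions introduced here for illustration.

```python
import numpy as np


def diagonal_fisher(per_sample_grads):
    """Approximate the Fisher diagonal as the mean squared per-sample gradient.

    Assumption: rows are samples, columns are parameters.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    return np.mean(grads ** 2, axis=0)


def selective_dampen(params, fisher_train, fisher_forget,
                     alpha=10.0, lam=1.0, eps=1e-12):
    """Select and dampen parameters that matter mostly to the forget set.

    alpha and lam are hypothetical hyperparameters; the abstract does not
    state how selection or scaling is parameterized.
    """
    params = np.asarray(params, dtype=float)
    fisher_train = np.asarray(fisher_train, dtype=float)
    fisher_forget = np.asarray(fisher_forget, dtype=float)

    # Step 1: select parameters whose forget-set importance is
    # disproportionately large compared to the full training data.
    selected = fisher_forget > alpha * fisher_train

    # Step 2: shrink selected parameters by the relative importance ratio,
    # capped at 1 so that a parameter is never amplified.
    scale = np.minimum(lam * fisher_train / (fisher_forget + eps), 1.0)
    dampened = np.where(selected, params * scale, params)
    return dampened, selected


# Toy example with three parameters and made-up per-sample gradients.
params = np.array([1.0, -2.0, 0.5])
train_grads = np.array([[1.0, 0.1, 1.0], [1.0, 0.1, 1.0]])
forget_grads = np.array([[1.0, 2.0, 0.5], [1.0, 2.0, 0.5]])

fisher_train = diagonal_fisher(train_grads)
fisher_forget = diagonal_fisher(forget_grads)
new_params, selected = selective_dampen(params, fisher_train, fisher_forget)
print(selected, new_params)
```

In this toy case only the second parameter is far more important to the forget set than to the training data, so it alone is scaled toward zero while the others are left untouched. No gradient step or retraining pass is involved, which is the property the abstract emphasizes.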