Recently, serious concerns have been raised about privacy when the training datasets of machine learning algorithms include personal data. Regulations in various countries, including the GDPR, grant individuals the right to have their personal data erased, known as 'the right to be forgotten' or 'the right to erasure'. However, there has been less research on effectively and practically deleting the requested personal data from the training set without jeopardizing overall machine learning performance. In this work, we propose a novel layer-level machine unlearning paradigm, called layer attack unlearning, that is highly accurate and fast compared to existing machine unlearning algorithms. We introduce the Partial-PGD algorithm to efficiently locate the samples to forget. In addition, inspired by the Forward-Forward algorithm, we use only the last layer of the model for the unlearning process. Lastly, we use Knowledge Distillation (KD) to reliably learn the decision boundaries from the teacher using soft label information, which improves accuracy. We conducted extensive experiments comparing against state-of-the-art (SOTA) machine unlearning methods and demonstrated the effectiveness of our approach in terms of accuracy and end-to-end unlearning performance.
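To make the role of soft labels in knowledge distillation concrete, the following is a minimal sketch of the commonly used temperature-scaled distillation loss (KL divergence between softened teacher and student distributions). It is not taken from this work: the function names `softmax` and `distillation_loss`, the default temperature of 4.0, and the T² scaling are illustrative assumptions, and the loss actually used in layer attack unlearning may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: a generic distillation loss, assumed here
    for illustration rather than taken from the paper.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft labels
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(
        pi * (math.log(pi) - math.log(qi))
        for pi, qi in zip(p, q)
        if pi > 0
    )
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 2.0, 1.0]
    print(distillation_loss(student, teacher))
```

A higher temperature flattens the teacher distribution, so the student also receives signal about how the teacher ranks the non-target classes; this is the "soft label information" that a hard one-hot label would discard.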