Machine unlearning, the process of removing the influence of specific training samples from a pre-trained model, has attracted significant attention in recent years. While extensive research has focused on developing efficient unlearning strategies, the critical problem of unlearning verification has been largely overlooked. Existing verification methods rely mainly on machine learning attack techniques such as membership inference attacks (MIAs) or backdoor attacks. However, because these methods were not designed for verification, they lack robustness and support only a small, predefined subset of samples. Moreover, their dependence on sample-level modifications prepared before training restricts their applicability in Machine Learning as a Service (MLaaS) environments. To address these limitations, we propose a novel, robust verification scheme that requires no prior modifications and supports verification over a much larger sample set. Our scheme employs an optimization-based method to recover actual training samples from the model. By comparing the samples recovered before and after unlearning, MLaaS users can verify the unlearning process. Operating exclusively on model parameters, the scheme avoids any sample-level modification prior to training while supporting verification on a much larger set and maintaining robustness. The effectiveness of the proposed approach is demonstrated through theoretical analysis and experiments with diverse models on various datasets under different scenarios.
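The core idea of the verification pipeline can be illustrated with a minimal sketch. The snippet below is a hypothetical, simplified illustration (not the paper's actual algorithm): a toy linear model stands in for the MLaaS model, `recover_sample` performs a model-inversion-style optimization that searches input space for a high-confidence sample encoded in the parameters, and the divergence between samples recovered from the pre- and post-unlearning parameters serves as the verification signal. All function and variable names are invented for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recover_sample(w, b, steps=200, lr=0.5, seed=0):
    """Optimization-based recovery sketch: ascend the gradient of the
    (toy linear) model's confidence with respect to the input x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=w.shape)
    for _ in range(steps):
        p = sigmoid(w @ x + b)
        # d/dx log sigmoid(w.x + b) = (1 - p) * w
        x += lr * (1.0 - p) * w
        x = np.clip(x, -3.0, 3.0)  # keep the recovery in a plausible input range
    return x

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# toy "pre-unlearning" parameters that still encode the target sample
w_pre = np.array([2.0, -1.0, 0.5])
# toy "post-unlearning" parameters with that influence removed
w_post = np.array([0.1, 0.3, -0.2])

x_pre = recover_sample(w_pre, b=0.0)
x_post = recover_sample(w_post, b=0.0)

# verification signal: recovered samples should diverge after unlearning
drift = 1.0 - cosine(x_pre, x_post)
```

In the full scheme described in the abstract, the comparison would run over many recovered samples rather than one, and the recovery objective would target a deep model's parameters; the sketch only conveys the structure of "recover, then compare pre- versus post-unlearning."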