Machine Unlearning (MU) has recently gained considerable attention for its potential to support safe AI by removing the influence of specific data from trained Machine Learning (ML) models. This process, known as knowledge removal, addresses AI governance concerns about training data, such as quality, sensitivity, copyright restrictions, and obsolescence. It is also crucial for compliance with privacy regulations that grant the Right To Be Forgotten (RTBF). Furthermore, effective knowledge removal mitigates the risk of harmful outcomes by safeguarding against biases, misinformation, and unauthorized data exploitation, thereby supporting the safe and responsible use of AI systems. Efforts have been made to design efficient unlearning approaches, and MU services are being examined for integration with existing Machine Learning as a Service (MLaaS) platforms, allowing users to submit requests to remove specific data from the training corpus. However, recent research highlights vulnerabilities in machine unlearning systems, such as information leakage and malicious unlearning, that can lead to significant security and privacy concerns. Moreover, extensive research indicates that unlearning methods and prevalent attacks play diverse roles within MU systems, which underscores the complex interplay among these mechanisms in maintaining system functionality and safety. Despite the large number of studies on threats, attacks, and defenses in machine unlearning, a comprehensive review that categorizes their taxonomy, methods, and solutions is still absent. This survey aims to fill that gap, offering insights for future research directions and practical implementations.