Threats, Attacks, and Defenses in Machine Unlearning: A Survey

Machine Unlearning (MU) has recently gained considerable attention for its potential to support Safe AI by removing the influence of specific data from trained Machine Learning (ML) models. This process, known as knowledge removal, addresses AI governance concerns about training data, such as quality, sensitivity, copyright restrictions, and obsolescence. It is also crucial for complying with privacy regulations such as the Right To Be Forgotten (RTBF). Furthermore, effective knowledge removal mitigates the risk of harmful outcomes by safeguarding against bias, misinformation, and unauthorized data exploitation, thereby supporting the safe and responsible use of AI systems. Efforts have been made to design efficient unlearning approaches, and MU services are being examined for integration with existing Machine Learning as a Service (MLaaS) platforms, allowing users to submit requests to remove specific data from the training corpus. However, recent research highlights vulnerabilities in machine unlearning systems, such as information leakage and malicious unlearning, that can lead to significant security and privacy concerns. Moreover, research indicates that unlearning methods and prevalent attacks play diverse roles within MU systems, which underscores the complex interplay among these mechanisms in maintaining system functionality and safety. Although many studies address threats, attacks, and defenses in machine unlearning, no comprehensive review categorizes their taxonomy, methods, and solutions. This survey aims to fill that gap, offering insights for future research directions and practical implementations.

