Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and the security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that, for certain tasks that can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary controlling merely $\omega(1)$ data points to reconstruct almost the entire dataset simply by issuing deletion requests. We survey existing definitions of machine unlearning, showing that they are either susceptible to such attacks or too restrictive to support basic functionalities such as exact summation. To address this problem, we propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points. We show that our definition permits several essential functionalities, such as bulletin boards, summations, and statistical learning.