Machine unlearning is studied for a multitude of tasks, but the specialization of unlearning methods to particular tasks has made their systematic comparison challenging. To address this issue, we propose a conceptual space to characterize diverse corrupted-data unlearning tasks in vision classifiers. This space is described by two dimensions: the discovery rate (the fraction of the corrupted data that is known at unlearning time) and the statistical regularity of the corrupted data (from random exemplars to shared concepts). Previously proposed methods have targeted portions of this space and, as we show, fail predictably outside these regions. We propose a novel method, Redirection for Erasing Memory (REM), whose key feature is that corrupted data are redirected to dedicated neurons introduced at unlearning time; these neurons are then discarded or deactivated to suppress the influence of the corrupted data. REM performs strongly across the space of tasks, in contrast to prior state-of-the-art (SOTA) methods, which fail outside the regions for which they were designed.
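As an illustration of the redirection idea only, the following minimal NumPy sketch shows a hidden layer with a few extra units that are active solely for corrupted examples known at unlearning time, and are then switched off for all inputs. It is not the authors' implementation: the layer sizes, the `forward` helper, the boolean masks, and the random weights are all assumptions made for this example, and no training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)


def forward(x, w_base, w_extra, use_extra):
    """Hidden ReLU layer made of base units plus extra (dedicated) units.

    use_extra is a boolean array with one entry per example; the extra
    units only produce output for examples where it is True.
    """
    h_base = np.maximum(x @ w_base, 0.0)
    h_extra = np.maximum(x @ w_extra, 0.0) * use_extra[:, None]
    return np.concatenate([h_base, h_extra], axis=1)


# Hypothetical sizes: 8 examples, 4 features, 6 base units, 2 extra units.
x = rng.normal(size=(8, 4))
w_base = rng.normal(size=(4, 6))
w_extra = rng.normal(size=(4, 2))

# Discovery rate: fraction of the corrupted examples known at unlearning time.
is_corrupted = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
is_discovered = np.array([1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
discovery_rate = is_discovered.sum() / is_corrupted.sum()

# During unlearning, only discovered corrupted examples use the extra units.
h_unlearning = forward(x, w_base, w_extra, use_extra=is_discovered)

# Afterwards, the extra units are deactivated for every input.
no_extra = np.zeros(len(x), dtype=bool)
h_final = forward(x, w_base, w_extra, use_extra=no_extra)
```

In this toy setup half of the corrupted examples are discovered, so `discovery_rate` is 0.5, and the last two columns of `h_final` are zero because the dedicated units no longer contribute. How the extra units are trained, and how undiscovered corrupted examples come to be handled, is the substance of the method and is not captured here.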