Machine unlearning strives to uphold data owners' right to be forgotten by enabling models to selectively forget specific data. Recent methods propose forgetting data by precomputing and storing statistics that carry second-order information, thereby improving computational and memory efficiency at unlearning time. However, these methods rely on restrictive assumptions, and their computation and storage costs scale poorly with the model's parameter dimensionality, making them impractical for most deep neural networks. In this work, we propose a Hessian-free online unlearning method. We maintain a statistical vector for each data point, computed through an affine stochastic recursion that approximates the difference between the retrained and learned models. The proposed algorithm achieves near-instantaneous online unlearning, since forgetting a data point requires only a single vector addition. By recollecting statistics only for the data to be forgotten, the method further reduces the unlearning runtime. Experimental studies demonstrate that the proposed scheme outperforms existing methods by orders of magnitude in time and memory cost, while also improving accuracy.
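To make the recipe concrete, below is a minimal sketch of maintaining a per-sample statistical vector during SGD and unlearning via a vector addition. It assumes per-sample gradient updates and, for brevity, simplifies the affine recursion's linear map to the identity (whereas a Hessian-free approximation of that map would be used in practice); the names `sgd_train_with_statistics`, `grad_fn`, and `stat_vectors` are illustrative, not from the paper.

```python
import numpy as np

def sgd_train_with_statistics(w, data, grad_fn, lr=0.1, epochs=1):
    """SGD training that also maintains one statistical vector per sample.

    Each vector approximates the difference between the model retrained
    without that sample and the learned model, via an affine stochastic
    recursion. Hypothetical sketch: the recursion's linear map is
    simplified to the identity here.
    """
    stat_vectors = np.zeros((len(data), w.size))  # one vector per data point
    for _ in range(epochs):
        for i, z in enumerate(data):
            g = grad_fn(w, z)             # per-sample gradient
            # Skipping sample i at this step would shift the SGD trajectory
            # by roughly +lr * g; record that shift as sample i's statistic.
            stat_vectors[i] += lr * g
            w = w - lr * g                # ordinary SGD update
    return w, stat_vectors

def unlearn(w, stat_vectors, forget_idx):
    """Online unlearning as a single vector addition (near-instant)."""
    return w + stat_vectors[forget_idx]
```

Under this simplification, the unlearning step costs O(d) per forgotten point in the parameter dimension d, independent of dataset size, and requires no Hessian computation, inversion, or retraining pass.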