In machine unlearning, $(\varepsilon,\delta)$-unlearning is a popular framework that provides formal guarantees on the removal of a subset of the training data, the forget set, from a trained model. For strongly convex objectives, existing first-order methods achieve $(\varepsilon,\delta)$-unlearning, but they use the forget set only to calibrate the injected noise, never as a direct optimization signal. In contrast, efficient empirical heuristics often exploit the forget samples (e.g., via gradient ascent) but come with no formal unlearning guarantees. We bridge this gap by presenting the Variance-Reduced Unlearning (VRU) algorithm. To the best of our knowledge, VRU is the first first-order algorithm that directly incorporates forget-set gradients into its update rule while provably satisfying $(\varepsilon,\delta)$-unlearning. We establish the convergence of VRU and show that incorporating the forget set yields strictly improved rates, i.e., a better dependence on the achieved error than existing first-order $(\varepsilon,\delta)$-unlearning methods. Moreover, we prove that, in the low-error regime, VRU asymptotically outperforms any first-order method that ignores the forget set. Experiments corroborate our theory, showing consistent gains over both state-of-the-art certified unlearning methods and empirical baselines that explicitly leverage the forget set.
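For reference, a common formulation of the $(\varepsilon,\delta)$-unlearning guarantee invoked above (variants in the literature differ in what auxiliary statistics the unlearning algorithm may retain): an unlearning algorithm $\bar{A}$ satisfies $(\varepsilon,\delta)$-unlearning with respect to a learning algorithm $A$ if, for every dataset $D$, every forget set $U \subseteq D$, and every measurable set of models $S$,
\[
\Pr\big[\bar{A}\big(A(D), D, U\big) \in S\big] \;\le\; e^{\varepsilon}\,\Pr\big[A(D \setminus U) \in S\big] + \delta,
\]
and the symmetric inequality with the two probabilities exchanged also holds. In words, the unlearned model must be statistically close to a model retrained from scratch without the forget set.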
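As a rough illustration of how forget-set gradients can enter a first-order update directly (this is a hypothetical sketch, not the VRU update, which is specified in the paper body), note the exact identity $\nabla F_{D \setminus U}(w) = \big(n\,\nabla F_D(w) - m\,\nabla F_U(w)\big)/(n-m)$ for average losses with $|D|=n$ and $|U|=m$. The sketch below combines this identity with Gaussian noise per step; the noise scale `sigma` would have to be calibrated to $(\varepsilon,\delta)$ via the paper's analysis and is left as a free parameter here.

```python
import numpy as np

def unlearning_step(w, grad_full, grad_forget, n, m, lr, sigma, rng):
    """One hypothetical noisy first-order unlearning step (illustration only).

    Uses the exact identity
        grad_retain = (n * grad_full - m * grad_forget) / (n - m),
    so the forget-set gradient is a direct part of the update, and adds
    Gaussian noise as certified unlearning methods do. Calibrating `sigma`
    to a target (eps, delta) is outside the scope of this sketch.
    """
    grad_retain = (n * grad_full - m * grad_forget) / (n - m)
    noise = sigma * rng.standard_normal(size=w.shape)
    return w - lr * (grad_retain + noise)

# Example usage with random placeholder gradients (illustration only).
rng = np.random.default_rng(0)
w = rng.standard_normal(5)
g_full = rng.standard_normal(5)     # gradient over the full training set
g_forget = rng.standard_normal(5)   # gradient over the forget set
w_next = unlearning_step(w, g_full, g_forget, n=1000, m=50,
                         lr=0.1, sigma=0.01, rng=rng)
```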