Machine unlearning has emerged as a new paradigm for deliberately forgetting data samples from a given model in order to comply with stringent regulations. However, existing machine unlearning methods have focused primarily on classification models, leaving unlearning for generative models relatively unexplored. This paper bridges that gap by providing a unifying framework of machine unlearning for image-to-image generative models. Within this framework, we propose a computationally efficient algorithm, underpinned by rigorous theoretical analysis, that incurs negligible performance degradation on the retain samples while effectively removing the information from the forget samples. Empirical studies on two large-scale datasets, ImageNet-1K and Places-365, further show that our algorithm does not rely on the availability of the retain samples, which also makes it compatible with data retention policies. To the best of our knowledge, this work is the first to present a systematic theoretical and empirical exploration of machine unlearning specifically tailored to image-to-image generative models. Our code is available at https://github.com/jpmorganchase/l2l-generator-unlearning.