Machine unlearning is a promising paradigm for removing unwanted data samples from a trained model, with the aim of ensuring compliance with privacy regulations and limiting harmful biases. Although unlearning has been demonstrated in, e.g., classification and recommendation systems, its potential in medical image-to-image translation, specifically in image reconstruction, has not been thoroughly investigated. This paper shows that machine unlearning is possible in MRI tasks and has the potential to benefit bias removal. We set up a protocol to study how much shared knowledge exists between datasets of different organs, allowing us to quantify the effect of unlearning. Our study reveals that combining training data can lead to hallucinations and reduced image quality in the reconstructed data. We use unlearning to remove hallucinations as a proxy example of undesired data removal. We show that machine unlearning is possible without full retraining. Furthermore, our observations indicate that maintaining high performance is feasible even when using only a subset of the retain data. We have made our code publicly accessible.