Conformal unlearning aims to ensure that a trained conformal predictor miscovers data points with specific shared characteristics, such as those from a particular label class, associated with a specific user, or belonging to a defined cluster, while maintaining valid coverage on the remaining data. Existing machine unlearning methods, which typically approximate a model retrained from scratch after removing the data to be forgotten, face significant challenges when applied to conformal unlearning. These methods often lack rigorous, uncertainty-aware statistical measures for evaluating unlearning effectiveness, and they exhibit a mismatch between their degraded performance on forgotten data and the frequency with which that data is still correctly covered by conformal predictors, a phenomenon we term "fake conformal unlearning". To address these limitations, we propose a new paradigm for conformal machine unlearning that provides finite-sample, uncertainty-aware guarantees on unlearning performance without relying on a retrained model as a reference. We formalize conformal unlearning as a dual objective requiring high coverage on retained data and high miscoverage on forgotten data, introduce practical empirical metrics for evaluation, and present an algorithm that optimizes these conformal objectives. Extensive experiments on vision and text benchmarks demonstrate that the proposed approach effectively removes targeted information while preserving utility.
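To make the coverage/miscoverage objective concrete, the following is a minimal sketch of split conformal prediction with empirical coverage measured separately on a retain set and a forget set. It uses toy random softmax scores rather than a real model, and all variable names and thresholds are illustrative; the paper's actual metrics and unlearning algorithm are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy 3-class "model" outputs for calibration, retain, and forget points.
n_cal, n_test, k = 500, 200, 3
cal_scores = softmax(rng.normal(size=(n_cal, k)))
cal_labels = rng.integers(0, k, size=n_cal)

# Split conformal calibration: nonconformity = 1 - softmax score of the true
# class; the quantile gives a threshold with ~(1 - alpha) marginal coverage.
alpha = 0.1
nonconf = 1.0 - cal_scores[np.arange(n_cal), cal_labels]
q = np.quantile(nonconf, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)

def empirical_coverage(scores, labels):
    # Prediction set = all classes whose nonconformity is below the threshold;
    # coverage = fraction of points whose true label lands in the set.
    in_set = (1.0 - scores) <= q
    return in_set[np.arange(len(labels)), labels].mean()

retain_scores = softmax(rng.normal(size=(n_test, k)))
retain_labels = rng.integers(0, k, size=n_test)
forget_scores = softmax(rng.normal(size=(n_test, k)))
forget_labels = rng.integers(0, k, size=n_test)

# Conformal unlearning targets high coverage on retained data and high
# miscoverage (1 - coverage) on forgotten data. Before unlearning, both sets
# are covered at roughly the nominal 1 - alpha rate, so miscoverage on the
# forget set stays low -- the symptom called "fake conformal unlearning".
cov_retain = empirical_coverage(retain_scores, retain_labels)
mis_forget = 1.0 - empirical_coverage(forget_scores, forget_labels)
print(f"retain coverage: {cov_retain:.2f}, forget miscoverage: {mis_forget:.2f}")
```

A successful conformal unlearning procedure would drive the forget-set miscoverage toward 1 while keeping the retain-set coverage near the nominal level.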