We consider the learning--unlearning paradigm, defined as follows. First, given a dataset, the goal is to learn a good predictor, such as one minimizing a certain loss. Subsequently, given any subset of examples that wish to be unlearnt, the goal is to produce, without knowledge of the original training dataset, a good predictor that is identical to the one that would have been obtained by learning from scratch on the surviving examples. We propose a new ticketed model for learning--unlearning, wherein the learning algorithm can send back additional information in the form of a small (encrypted) ``ticket'' to each participating training example, in addition to retaining a small amount of ``central'' information for later use. Subsequently, the examples that wish to be unlearnt present their tickets to the unlearning algorithm, which additionally uses the central information to return a new predictor. We provide space-efficient ticketed learning--unlearning schemes for a broad family of concept classes, including thresholds, parities, and intersection-closed classes, among others. En route, we introduce the count-to-zero problem, where, during unlearning, the goal is simply to determine whether any examples have survived. We give a ticketed learning--unlearning scheme for this problem that relies on the construction of Sperner families with certain properties, which might be of independent interest.
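To make the interface of the ticketed model concrete, the following is a minimal sketch of a naive baseline for the count-to-zero problem. It is not the Sperner-family construction referred to above and makes no attempt at space efficiency: it stores the full example count as central information and hands each example its index as a ticket. The names `Central`, `learn`, and `unlearn` are hypothetical, and the sketch assumes that tickets are unforgeable and that each unlearning example presents its own ticket.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Central:
    """Central information retained by the learner (hypothetical layout)."""
    count: int  # number of examples seen at learning time


def learn(examples: Sequence[object]) -> Tuple[Central, List[int]]:
    """Naive baseline: remember the count, give example i the ticket i."""
    central = Central(count=len(examples))
    tickets = list(range(len(examples)))  # assumption: ticket = example index
    return central, tickets


def unlearn(central: Central, presented: Iterable[int]) -> bool:
    """Return True if at least one example survives the unlearning request.

    Uses only the central information and the presented tickets,
    never the original dataset.
    """
    distinct = set(presented)  # ignore a ticket presented more than once
    return len(distinct) < central.count


if __name__ == "__main__":
    central, tickets = learn(["x1", "x2", "x3"])
    print(unlearn(central, tickets[:2]))  # some example survives
    print(unlearn(central, tickets))      # every example was unlearnt
```

This baseline needs about log n bits both centrally and per ticket for n examples; it only illustrates the roles of the central information and the tickets in the model.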