We study the problem of $(\epsilon,\delta)$-certified machine unlearning for minimax models. Most existing works focus on unlearning from standard statistical learning models with a single variable, and their unlearning steps hinge on the conventional Newton update based on the direct Hessian. We develop a new $(\epsilon,\delta)$-certified machine unlearning algorithm for minimax models. It performs a minimax unlearning step that combines a complete Newton update based on the total Hessian with the Gaussian mechanism borrowed from differential privacy. To obtain the unlearning certification, our method injects calibrated Gaussian noise by carefully analyzing the "sensitivity" of the minimax unlearning step (i.e., the closeness between the minimax unlearning variables and the retraining-from-scratch variables). We derive generalization rates in terms of the population strong and weak primal-dual risk for three cases of loss functions, namely (strongly-)convex-(strongly-)concave losses. We also characterize the deletion capacity, which guarantees that a desired population risk is maintained as long as the number of deleted samples does not exceed the derived amount. With $n$ training samples and model dimension $d$, the deletion capacity is of order $\mathcal{O}(n/d^{1/4})$, a strict improvement over the baseline of differentially private minimax learning, which attains $\mathcal{O}(n/d^{1/2})$. In addition, our rates for generalization and deletion capacity match the state-of-the-art rates derived previously for standard statistical learning models.
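As an illustrative sketch (our notation, not the paper's exact formulation): let $f(\theta,\phi;z)$ denote the loss, $S$ the training set of size $n$, $U\subset S$ the samples to be deleted, and $(\theta^*,\phi^*)$ the model trained on $S$. Writing $F_{S\setminus U}$ for the empirical loss on the retained data, the total Hessian in $\theta$ is the Schur complement
\[
\mathbf{D}_{\theta\theta} F_{S\setminus U} \;=\; \nabla_{\theta\theta} F_{S\setminus U} \;-\; \nabla_{\theta\phi} F_{S\setminus U}\,\big(\nabla_{\phi\phi} F_{S\setminus U}\big)^{-1}\,\nabla_{\phi\theta} F_{S\setminus U},
\]
and one hypothetical form of the complete Newton unlearning step for the primal variable, mirroring the single-variable Newton unlearning update, is
\[
\hat{\theta} \;=\; \theta^* \;+\; \frac{1}{n-|U|}\,\Big(\mathbf{D}_{\theta\theta} F_{S\setminus U}(\theta^*,\phi^*)\Big)^{-1} \sum_{z\in U} \nabla_\theta f(\theta^*,\phi^*;z),
\]
with an analogous update for the dual variable $\phi$ using its own total Hessian. The $(\epsilon,\delta)$ certificate would then follow from releasing $\hat{\theta}+\xi$ and $\hat{\phi}+\zeta$, where $\xi,\zeta$ are Gaussian noise vectors whose variance is calibrated, via the Gaussian mechanism, to a bound on the distance between $(\hat{\theta},\hat{\phi})$ and the retraining-from-scratch solution.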