Machine unlearning, the ability to erase the effect of specific training samples without retraining from scratch, is critical for privacy, regulatory compliance, and efficiency. However, most progress in unlearning has been empirical, with little theoretical understanding of when and why it works. We address this gap by framing unlearning through the lens of asymptotic linear stability, capturing the interaction between optimization dynamics and data geometry. The key quantity in our analysis is data coherence: the cross-sample alignment of loss-surface directions near the optimum. We decompose coherence along three axes, within the retain set, within the forget set, and between them, and prove tight stability thresholds that separate convergence from divergence. To further link data properties to forgettability, we study a two-layer ReLU CNN under a signal-plus-noise model and show that stronger memorization makes forgetting easier: when the signal-to-noise ratio (SNR) is low, cross-sample alignment is weak, which reduces coherence and eases unlearning; conversely, high-SNR, highly aligned models resist unlearning. For empirical verification, we show that Hessian tests and CNN heatmaps align closely with the predicted boundary, mapping the stability frontier of gradient-based unlearning as a function of batching, mixing, and data/model alignment. Our analysis is grounded in random-matrix-theory tools and provides the first principled account of the trade-offs among memorization, coherence, and unlearning.
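For intuition, a minimal linear-stability sketch (the textbook full-batch condition, not this paper's exact threshold): near a minimizer $\theta^*$ with Hessian $H = \nabla^2 L(\theta^*)$, gradient descent with step size $\eta$ linearizes to
\[
\theta_{t+1} - \theta^* \approx (I - \eta H)\,(\theta_t - \theta^*),
\]
which is stable if and only if $\eta\,\lambda_{\max}(H) < 2$. Under stochastic sampling of retain and forget batches, the cross-sample alignment of per-sample curvature (the coherence terms above) enters the stability condition and tightens this bound.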