Graph unlearning, which aims to eliminate the influence of specific nodes, edges, or attributes from a trained Graph Neural Network (GNN), is essential in applications where privacy, bias, or data obsolescence is a concern. However, existing graph unlearning techniques often necessitate additional training on the remaining data, leading to significant computational costs, particularly on large-scale graphs. To address these challenges, we propose a two-stage training-free approach, Erase then Rectify (ETR), designed for efficient and scalable graph unlearning while preserving model utility. Specifically, we first build a theoretical foundation showing that masking parameters critical for unlearned samples enables effective unlearning. Building on this insight, the Erase stage strategically edits model parameters to eliminate the impact of unlearned samples and their propagated influence on intercorrelated nodes. To further ensure the GNN's utility, the Rectify stage devises a gradient approximation method to estimate the model's gradient on the remaining dataset, which is then used to enhance model performance. Overall, ETR achieves graph unlearning without additional training or full training data access, significantly reducing computational overhead and preserving data privacy. Extensive experiments on seven public datasets demonstrate the consistent superiority of ETR in model utility, unlearning efficiency, and unlearning effectiveness, establishing it as a promising solution for real-world graph unlearning challenges.
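The two stages described above can be illustrated with a toy numerical sketch. Note that the importance score (forget-set gradient magnitude) and the remaining-data gradient estimate below are illustrative assumptions chosen for clarity, not the paper's exact formulas; `erase_then_rectify` and all its parameters are hypothetical names introduced here.

```python
import numpy as np

def erase_then_rectify(theta, g_forget, g_full, n_total, n_forget,
                       mask_ratio=0.1, lr=0.05):
    """Toy sketch of a two-stage, training-free unlearning update.

    theta    : flat parameter vector of the trained model
    g_forget : gradient of the loss on the samples to be unlearned
    g_full   : cached gradient of the loss on the full training set
    All scoring and estimation choices here are illustrative assumptions.
    """
    theta = theta.copy()

    # Erase: mask the parameters most critical to the forgotten samples,
    # scored here (as a stand-in) by the magnitude of their forget-set
    # gradient.
    k = max(1, int(mask_ratio * theta.size))
    critical = np.argsort(-np.abs(g_forget))[:k]
    theta[critical] = 0.0

    # Rectify: approximate the remaining-data gradient from cached
    # quantities, using the identity
    #   n_total * g_full = n_forget * g_forget + n_remain * g_remain,
    # then take one corrective step without accessing the retained data.
    g_remain = (n_total * g_full - n_forget * g_forget) / (n_total - n_forget)
    theta -= lr * g_remain
    return theta
```

The key property the sketch captures is that both stages use only quantities computable from the unlearned samples plus cached statistics, so no retraining pass over the remaining graph is needed.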