EdgePruner: Poisoned Edge Pruning in Graph Contrastive Learning

Graph Contrastive Learning (GCL) is an unsupervised graph representation learning method that can obtain useful representations of unknown nodes. These node representations can be used as features for downstream tasks. However, like other learning models, GCL is vulnerable to poisoning attacks. A state-of-the-art defense introduces adversarial training into GCL, yet it cannot sufficiently negate the adverse effects of poisoned graphs. To achieve further improvement, pruning adversarial edges is important. To the best of our knowledge, the feasibility of such pruning remains unexplored in the GCL domain. In this paper, we propose EdgePruner, a simple defense for GCL. We focus on the fact that the state-of-the-art poisoning attack on GCL mainly adds adversarial edges to create poisoned graphs, which means that pruning edges is important for sanitizing the graphs. Thus, EdgePruner prunes edges whose removal contributes to minimizing the contrastive loss, which is computed from the node representations obtained by training GCL on the poisoned graph. Furthermore, we focus on the fact that, in poisoned graphs, adversarial edges connect nodes with distinct features. Thus, we introduce the feature similarity between neighboring nodes to identify adversarial edges more accurately. This similarity helps further eliminate the adverse effects of poisoned graphs on various datasets. Finally, EdgePruner outputs the graph that yields the minimum contrastive loss as the sanitized graph. Our results on six datasets demonstrate that pruning adversarial edges is feasible. EdgePruner improves the accuracy of node classification under the attack by up to 5.55% compared with the state-of-the-art defense. Moreover, we show that EdgePruner is immune to an adaptive attack.
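
The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `toy_loss`, `prune_edges`), the `max_pruned` parameter, and the greedy one-edge-at-a-time schedule are all assumptions made for illustration. In particular, `toy_loss` is a stand-in proxy; the actual method would evaluate the contrastive loss from node representations learned by a GCL model on the poisoned graph.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def toy_loss(edges, features):
    """Hypothetical stand-in for the contrastive loss: mean feature
    dissimilarity over edges. A real pipeline would use the GCL loss."""
    if not edges:
        return float("inf")  # assumption: never prefer the empty graph
    total = sum(1.0 - cosine(features[a], features[b]) for a, b in edges)
    return total / len(edges)


def prune_edges(edges, features, loss_fn, max_pruned):
    """Greedy sketch: repeatedly drop the edge whose endpoints have the least
    similar features, and return the candidate graph with the minimum loss."""
    current = list(edges)
    best_edges, best_loss = list(current), loss_fn(current, features)
    for _ in range(min(max_pruned, len(edges))):
        # The least feature-similar edge is the most suspicious one.
        suspect = min(current, key=lambda e: cosine(features[e[0]], features[e[1]]))
        current.remove(suspect)
        loss = loss_fn(current, features)
        if loss < best_loss:
            best_edges, best_loss = list(current), loss
    return best_edges, best_loss


# Toy graph: nodes 0 and 1 share features, as do 2 and 3; edge (0, 2) joins
# nodes with distinct features and plays the role of an adversarial edge.
features = {0: [1.0, 0.0], 1: [1.0, 0.1], 2: [0.0, 1.0], 3: [0.1, 1.0]}
poisoned_edges = [(0, 1), (2, 3), (0, 2)]

sanitized, loss = prune_edges(poisoned_edges, features, toy_loss, max_pruned=2)
print(sanitized, loss)
```

On this toy graph, the sketch drops the edge between the two dissimilar nodes and keeps the rest, because removing a second edge does not lower the proxy loss any further. How the real method weights the contrastive loss against feature similarity is not specified in the abstract, so that trade-off is left out here.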
