Graph neural networks (GNNs) are vulnerable to adversarial attacks, which aim to degrade the performance of GNNs through imperceptible changes to the graph. However, we find that the prevalent meta-gradient-based attacks, which utilize the gradient of the loss w.r.t. the adjacency matrix, are in fact biased towards training nodes. That is, their meta-gradient is determined by the training procedure of the surrogate model, which is trained solely on the training nodes. This bias manifests as uneven perturbations: the attacks connect two nodes when at least one of them is a labeled (i.e., training) node, while they are unlikely to connect two unlabeled nodes. These biased attacks are therefore sub-optimal, as they do not consider flipping edges between two unlabeled nodes at all, and thus miss potential attack edges between unlabeled nodes that could significantly alter a node's representation. In this paper, we investigate the meta-gradients to uncover the root cause of the uneven perturbations of existing attacks. Based on our analysis, we propose a Meta-gradient-based attack method using a contrastive surrogate objective (Metacon), which alleviates the bias in the meta-gradient via a new surrogate loss. Extensive experiments on benchmark datasets show that Metacon outperforms existing meta-gradient-based attack methods, and that alleviating the bias towards training nodes is effective for attacking the graph structure.
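The bias towards training nodes can be seen in a minimal toy setting (an illustrative sketch, not the paper's actual surrogate or attack): with a one-layer linear "GCN" whose training loss touches only the labeled rows of the output, the closed-form gradient of that loss w.r.t. the adjacency matrix is exactly zero on every row belonging to an unlabeled node, so any edge between two unlabeled nodes receives zero gradient signal. All names and sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 6, 4, 2                        # nodes, feature dim, classes (toy sizes)
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T           # symmetric adjacency, no self-loops
X = rng.standard_normal((n, d))          # node features
W = rng.standard_normal((d, c))          # surrogate weights
Y = np.eye(c)[rng.integers(0, c, n)]     # one-hot labels

train = np.array([0, 1])                 # labeled (training) nodes only
mask = np.zeros((n, 1)); mask[train] = 1.0

# One-layer linear surrogate: Z = A X W, squared-error loss on labeled rows:
#   L = || mask * (A X W - Y) ||^2
Z = A @ X @ W
R = mask * (Z - Y)                       # residual, zeroed on unlabeled rows
grad_A = 2.0 * R @ (X @ W).T             # dL/dA in closed form

# Every row of the gradient that belongs to an unlabeled node is exactly zero,
# so an edge flip between two unlabeled nodes gets zero score from this loss.
unlabeled = [i for i in range(n) if i not in train]
print(np.abs(grad_A[unlabeled]).max())   # -> 0.0
```

Because the symmetric edge score for (i, j) combines grad_A[i, j] and grad_A[j, i], it vanishes whenever both endpoints are unlabeled, which is the uneven-perturbation pattern described above; real meta-gradient attacks unroll the full surrogate training loop rather than this one-shot loss, but the same mechanism drives the bias.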