Single Node Injection Label Specificity Attack on Graph Neural Networks via Reinforcement Learning

Graph neural networks (GNNs) have achieved remarkable success in a wide range of real-world applications. However, recent studies show that GNNs are vulnerable to malicious perturbations. Existing attacks mainly rely on modifying the original graph or injecting nodes into it; they achieve promising results but have notable limitations. Graph modification attacks~(GMA) require manipulating the original graph, which is often impractical, while graph injection attacks~(GIA) require training a surrogate model in the black-box setting, and the mismatch between the surrogate architecture and the actual victim model significantly degrades attack performance. Furthermore, most methods target a single attack goal; they lack a generalizable adversary that can develop distinct attack strategies for diverse goals, which limits precise control over the victim model's behavior in real-world scenarios. To address these issues, we present a gradient-free, generalizable adversary that injects a single malicious node to manipulate the classification result of a target node in the black-box evasion setting. We call it the Gradient-free Generalizable Single Node Injection Attack (G$^2$-SNIA), a reinforcement learning framework based on Proximal Policy Optimization. By directly querying the victim model, G$^2$-SNIA learns attack patterns through exploration and achieves diverse attack goals under extremely limited attack budgets. Comprehensive experiments on three widely used benchmark datasets and four prominent GNNs, in the most challenging and realistic scenario, demonstrate that G$^2$-SNIA outperforms existing state-of-the-art baselines. Moreover, a comparison with multiple white-box evasion baselines confirms that G$^2$-SNIA can generate solutions comparable to those of the best adversaries.
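To make the black-box, single-node-injection setting concrete, the following is a minimal sketch of one query step: a node is appended to the graph, the victim is queried for the target node's label, and a reward is computed for a chosen goal label. It is not the G$^2$-SNIA implementation. The names `inject_single_node`, `toy_victim`, and `attack_reward` are hypothetical, the victim is an assumed one-layer mean-aggregation classifier standing in for a real GNN, the 0/1 reward is an assumption, and the Proximal Policy Optimization policy that would choose the injected features and edges is omitted.

```python
import numpy as np


def inject_single_node(adj, feats, new_feat, neighbors):
    """Return a copy of the graph with one extra node linked to `neighbors`.

    The original adjacency matrix and feature matrix are left untouched,
    matching the evasion setting where existing nodes and edges are not modified.
    """
    n = adj.shape[0]
    adj_new = np.zeros((n + 1, n + 1), dtype=float)
    adj_new[:n, :n] = adj
    for v in neighbors:
        adj_new[n, v] = 1.0
        adj_new[v, n] = 1.0
    feats_new = np.vstack([feats, new_feat])
    return adj_new, feats_new


def toy_victim(adj, feats, weight):
    """Hypothetical black-box victim: returns predicted labels only.

    One mean-aggregation layer with self-loops followed by a linear classifier.
    The attacker never sees `weight` or any gradients, only the labels.
    """
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ feats @ weight).argmax(axis=1)


def attack_reward(pred_label, goal_label):
    """Assumed reward: 1.0 if the target node is classified as the goal label."""
    return 1.0 if pred_label == goal_label else 0.0


# Toy graph: node 0 is the (isolated) target, nodes 1 and 2 are connected.
adj = np.array([[0, 0, 0],
                [0, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]], dtype=float)
weight = np.array([[1, 0],
                   [0, 1],
                   [0, 1]], dtype=float)  # hidden from the attacker

target, goal_label = 0, 1
clean_pred = toy_victim(adj, feats, weight)[target]

# One candidate action: binary features and a single edge to the target.
adj_atk, feats_atk = inject_single_node(adj, feats, np.array([0.0, 1.0, 1.0]), [target])
attacked_pred = toy_victim(adj_atk, feats_atk, weight)[target]
reward = attack_reward(attacked_pred, goal_label)
```

In this toy case the target is predicted as class 0 on the clean graph and as class 1 after the injection, so the reward is 1.0. A learned policy would propose the injected features and edges and use such rewards, obtained purely from queries, as its training signal.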
