Although the Transformer has achieved great success in natural language processing and computer vision, it has difficulty generalizing to medium- and large-scale graph data for two important reasons: (i) high complexity, and (ii) failure to capture complex and entangled structural information. In graph representation learning, Graph Neural Networks (GNNs) can fuse graph structure and node attributes but have limited receptive fields. We therefore ask whether Transformers and GNNs can be combined so that each compensates for the other's weaknesses. In this paper, we propose a new model named TransGNN, in which Transformer layers and GNN layers are used alternately to improve each other. Specifically, to expand the receptive field and disentangle information aggregation from edges, we propose using the Transformer to aggregate information from more relevant nodes, which improves the message passing of GNNs. In addition, to capture graph structure information, we use positional encoding and employ the GNN layer to fuse the structure into node attributes, which improves the Transformer on graph data. We also propose sampling the most relevant nodes for the Transformer, together with two efficient sample-update strategies, to lower the complexity. Finally, we theoretically prove that TransGNN is more expressive than GNNs with only linear additional complexity. Experiments on eight datasets corroborate the effectiveness of TransGNN on node and graph classification tasks.
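The alternating design described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`sample_relevant`, `attention_layer`, `gnn_layer`, `transgnn_sketch`) are hypothetical, cosine similarity of node attributes is an assumed stand-in for the paper's relevance measure, the attention has no learned weights, positional encoding is omitted, and the brute-force resampling below is quadratic in the number of nodes rather than one of the paper's efficient sample-update strategies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two attribute vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def sample_relevant(x, k):
    """For each node, pick the k other nodes with the highest cosine similarity.

    Assumption: attribute similarity stands in for "relevance"; this brute-force
    version compares every pair of nodes.
    """
    n = len(x)
    samples = []
    for i in range(n):
        others = (j for j in range(n) if j != i)
        ranked = sorted(others, key=lambda j: cosine(x[i], x[j]), reverse=True)
        samples.append(ranked[:k])
    return samples


def attention_layer(x, samples):
    """Each node attends over itself and its sampled nodes (scaled dot-product softmax)."""
    d = len(x[0])
    out = []
    for i, picked in enumerate(samples):
        idx = [i] + picked
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d) for j in idx]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * x[j][c] for w, j in zip(weights, idx)) for c in range(d)])
    return out


def gnn_layer(x, adj):
    """Mean aggregation over each node and its graph neighbours (message passing)."""
    d = len(x[0])
    out = []
    for i, neighbours in enumerate(adj):
        idx = [i] + list(neighbours)
        out.append([sum(x[j][c] for j in idx) / len(idx) for c in range(d)])
    return out


def transgnn_sketch(x, adj, k=2, blocks=2):
    """Alternate an attention layer over sampled nodes with a GNN layer over edges."""
    for _ in range(blocks):
        samples = sample_relevant(x, k)  # naive resampling once per block
        x = attention_layer(x, samples)  # wider receptive field, independent of edges
        x = gnn_layer(x, adj)            # fuses graph structure into node attributes
    return x
```

As a usage example, calling `transgnn_sketch(x, adj, k=1)` with `x` as a list of per-node attribute lists and `adj` as a list of neighbour-index lists returns updated attributes of the same shape. Because the attention step draws on sampled nodes rather than edges, a node can receive information from a similar but distant node within a single block, which is the receptive-field benefit the abstract describes.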