Self-Supervised Graph Structure Refinement for Graph Neural Networks

Graph structure learning (GSL), which aims to learn the adjacency matrix for graph neural networks (GNNs), has shown great potential for boosting GNN performance. Most existing GSL methods use a joint learning framework in which the estimated adjacency matrix and the GNN parameters are optimized together for the downstream task. However, GSL is essentially a link prediction task, whose goal may differ substantially from that of the downstream task. This inconsistency between the two goals prevents GSL methods from learning the potentially optimal graph structure. Moreover, the joint learning framework suffers from scalability issues in both time and space when estimating and optimizing the adjacency matrix. To mitigate these issues, we propose a graph structure refinement (GSR) framework with a pretrain-finetune pipeline. Specifically, the pre-training phase aims to comprehensively estimate the underlying graph structure with a multi-view contrastive learning framework that uses both intra- and inter-view link prediction tasks. Then, the graph structure is refined by adding and removing edges according to the edge probabilities estimated by the pre-trained model. Finally, the fine-tuning GNN is initialized from the pre-trained model and optimized for the downstream task. Because the refined graph structure remains static during fine-tuning, GSR avoids estimating and optimizing the graph structure in the fine-tuning phase, which makes it scalable and efficient. Moreover, the fine-tuning GNN benefits from both the knowledge transferred from the pre-trained model and the refined graph. Extensive experiments evaluate the effectiveness (best performance on six benchmark datasets), efficiency, and scalability (13.8x faster using 32.8% of the GPU memory compared with the best GSL baseline on Cora) of the proposed model.
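The refinement step described above (adding and removing edges according to estimated edge probabilities) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_graph`, the fixed edge budgets `n_add` and `n_remove`, and the dense NumPy representation are all assumptions made for illustration, and the edge probabilities are taken as a given input rather than produced by a pre-trained model.

```python
import numpy as np


def refine_graph(adj, edge_prob, n_add, n_remove):
    """Return a refined copy of a symmetric 0/1 adjacency matrix.

    Hypothetical helper: removes the `n_remove` existing edges with the
    lowest probability and adds the `n_add` missing edges with the
    highest probability. `edge_prob` is assumed to be symmetric.
    """
    n = adj.shape[0]
    iu, ju = np.triu_indices(n, k=1)   # visit each undirected pair once
    is_edge = adj[iu, ju] > 0
    probs = edge_prob[iu, ju]

    refined = adj.copy()

    # Remove the least probable existing edges.
    edge_idx = np.flatnonzero(is_edge)
    drop = edge_idx[np.argsort(probs[edge_idx], kind="stable")[:n_remove]]
    refined[iu[drop], ju[drop]] = 0
    refined[ju[drop], iu[drop]] = 0

    # Add the most probable missing edges.
    non_idx = np.flatnonzero(~is_edge)
    add = non_idx[np.argsort(-probs[non_idx], kind="stable")[:n_add]]
    refined[iu[add], ju[add]] = 1
    refined[ju[add], iu[add]] = 1

    return refined


# Toy path graph 0-1-2-3 with made-up edge probabilities.
adj = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1

prob = np.zeros((4, 4))
for (i, j), p in {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8,
                  (0, 2): 0.7, (0, 3): 0.1, (1, 3): 0.3}.items():
    prob[i, j] = prob[j, i] = p

refined = refine_graph(adj, prob, n_add=1, n_remove=1)
```

In this toy example the low-probability edge (1, 2) is dropped and the high-probability missing edge (0, 2) is added. The refined matrix is computed once and then held fixed, which mirrors the point that no graph structure is estimated or optimized during fine-tuning.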
