The "graph pre-training and fine-tuning" paradigm has significantly improved Graph Neural Networks (GNNs) by capturing general knowledge without manual annotations for downstream tasks. However, because of the large gap in data and tasks between the pre-training and fine-tuning stages, model performance remains limited. Inspired by prompt fine-tuning in Natural Language Processing (NLP), many efforts have been made to bridge this gap in the graph domain. Existing methods, however, simply reformulate fine-tuning tasks into the form of the pre-training ones. Because they presume that the pre-training graphs are compatible with the fine-tuning ones, these methods typically operate in the transductive setting. To generalize graph pre-training to the inductive scenario, where the fine-tuning graphs may differ significantly from the pre-training ones, we propose a novel graph-prompt-based method called Inductive Graph Alignment Prompt (IGAP). First, we unify the mainstream graph pre-training frameworks and analyze the essence of graph pre-training from the perspective of graph spectral theory. We then identify the two sources of the data gap in the inductive setting: (i) the graph signal gap and (ii) the graph structure gap. Building on this insight into graph pre-training, we propose to bridge the graph signal gap and the graph structure gap with learnable prompts in the spectral space. A theoretical analysis ensures the effectiveness of our method. Finally, we conduct extensive experiments on node classification and graph classification tasks under the transductive, semi-inductive, and inductive settings. The results demonstrate that the proposed method can successfully bridge the data gap under different settings.