Positional and structural encodings (PSEs) make nodes within a graph easier to identify, because graphs in general lack a canonical node ordering. This makes PSEs essential tools for empowering modern GNNs, and graph Transformers in particular. However, designing PSEs that work optimally across a variety of graph prediction tasks remains a challenging, unsolved problem. Here, we present the graph positional and structural encoder (GPSE), a first-ever attempt to train a graph encoder that captures rich PSE representations for augmenting any GNN. GPSE can effectively learn a common latent representation for multiple PSEs and is highly transferable: an encoder trained on a particular graph dataset can be used effectively on datasets drawn from significantly different distributions and even modalities. We show that, across a wide range of benchmarks, GPSE-enhanced models significantly improve performance on certain tasks while performing on par with models that employ explicitly computed PSEs in other cases. Our results pave the way for the development of large pre-trained models for extracting graph positional and structural information, and highlight their potential as a viable alternative to explicitly computed PSEs as well as to existing self-supervised pre-training approaches.
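To make the notion of an explicitly computed PSE concrete, the sketch below computes one commonly used structural encoding: the probability that a random walk returns to its starting node after 1, 2, ..., k steps. The abstract does not say which encodings GPSE is trained on, so the choice of this encoding, the function name `random_walk_encoding`, and the toy path graph are all assumptions made for illustration only. The example shows how such an encoding can separate nodes that have the same degree.

```python
def random_walk_encoding(adjacency, num_steps):
    """Return, per node i, the list [P^1[i][i], ..., P^k[i][i]],
    where P = D^-1 A is the random-walk transition matrix.
    (Hypothetical helper, not taken from the GPSE work.)"""
    n = len(adjacency)
    # Row-normalize the adjacency matrix; isolated nodes get a zero row.
    transition = []
    for row in adjacency:
        degree = sum(row)
        transition.append([a / degree if degree else 0.0 for a in row])

    encodings = [[] for _ in range(n)]
    power = [row[:] for row in transition]  # starts as P^1
    for _ in range(num_steps):
        # Record the return probability at the current walk length.
        for i in range(n):
            encodings[i].append(power[i][i])
        # Advance to the next matrix power: power <- power @ transition.
        power = [
            [sum(power[i][k] * transition[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return encodings


# Toy input (an assumption): the undirected path graph 0-1-2-3-4.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adjacency = [[0] * 5 for _ in range(5)]
for u, v in path_edges:
    adjacency[u][v] = 1
    adjacency[v][u] = 1

encodings = random_walk_encoding(adjacency, num_steps=3)

# Nodes 1 and 2 both have degree 2, yet their 2-step return
# probabilities differ, so the encoding tells them apart.
print(encodings[1][1], encodings[2][1])  # 0.75 0.5
```

Computing such encodings requires repeated matrix products (or, for spectral encodings, an eigendecomposition) on every input graph, which is the kind of explicit computation a learned encoder would aim to replace.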