When to Pre-Train Graph Neural Networks? An Answer from Data Generation Perspective!

Graph pre-training has recently attracted wide research attention; it aims to learn transferable knowledge from unlabeled graph data to improve downstream performance. Despite these efforts, negative transfer remains a major issue when applying pre-trained graph models to downstream tasks. Existing work has focused on what to pre-train and how to pre-train, designing a number of graph pre-training and fine-tuning strategies. However, there are cases where, no matter how advanced the strategy, the "pre-train and fine-tune" paradigm still fails to deliver clear benefits. This paper introduces a generic framework, W2PGNN, to answer the crucial question of when to pre-train (i.e., in what situations we can take advantage of graph pre-training) before committing to costly pre-training or fine-tuning. We start from a new perspective and explore the complex generative mechanisms that lead from the pre-training data to the downstream data. In particular, W2PGNN first fits the pre-training data into a graphon basis; each element of the basis (i.e., a graphon) identifies a fundamental transferable pattern shared by a collection of pre-training graphs. All convex combinations of the basis elements give rise to a generator space, and the graphs generated from this space form the solution space of downstream data that can benefit from pre-training. In this manner, the feasibility of pre-training can be quantified as the generation probability of the downstream data from any generator in the generator space. W2PGNN supports three broad applications: determining the application scope of graph pre-trained models, quantifying the feasibility of pre-training, and helping select pre-training data to enhance downstream performance. We provide a theoretically sound solution for the first application and extensive empirical justification for the latter two.
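To make the notion of a generator space more concrete, the following is a minimal sketch, not W2PGNN's actual implementation. It assumes each graphon is approximated by a symmetric step function, stored as a K x K matrix of edge probabilities. The function names `mix_graphons` and `sample_graph`, and the two example bases, are hypothetical and chosen only for illustration. The sketch shows how a convex combination of basis graphons yields one generator, and how a graph can be sampled from it; it does not attempt to estimate the generation probability described in the abstract.

```python
import random


def mix_graphons(bases, weights):
    """Convex combination of step-function graphons (K x K probability matrices)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    k = len(bases[0])
    return [
        [sum(w * b[i][j] for w, b in zip(weights, bases)) for j in range(k)]
        for i in range(k)
    ]


def sample_graph(graphon, n, rng):
    """Sample an n-node undirected graph from a step-function graphon.

    Each node draws a latent position in [0, 1), which selects one of K blocks;
    edge (a, b) is then included with probability graphon[block_a][block_b].
    """
    k = len(graphon)
    blocks = [min(int(rng.random() * k), k - 1) for _ in range(n)]
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < graphon[blocks[a]][blocks[b]]:
                edges.add((a, b))
    return edges


# Two hypothetical basis graphons on a 2 x 2 grid (illustrative values only).
community = [[0.9, 0.1], [0.1, 0.9]]  # dense within blocks
bipartite = [[0.1, 0.9], [0.9, 0.1]]  # dense across blocks

# One point in the generator space: a convex combination of the bases.
generator = mix_graphons([community, bipartite], [0.75, 0.25])
graph = sample_graph(generator, n=20, rng=random.Random(0))
```

Varying the weight vector over all non-negative weights that sum to one sweeps out the whole generator space spanned by the chosen bases. A step-function representation is used here only because it is easy to store and sample from; how the bases are fitted from pre-training graphs is outside the scope of this sketch.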
