Graph diffusion equations are intimately related to graph neural networks (GNNs) and have recently attracted attention as a principled framework for analyzing GNN dynamics, formalizing their expressive power, and justifying architectural choices. A key open question in graph learning is how well GNNs generalize. A major limitation of current approaches is their assumption that the graph topologies in the training and test sets come from the same distribution. In this paper, we take steps towards understanding the generalization of GNNs by exploring how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies. We first show deficiencies in the generalization capability of existing models built upon local diffusion on graphs, which stem from their exponential sensitivity to topology variation. Our subsequent analysis reveals the promise of non-local diffusion, which advocates for feature propagation over fully-connected latent graphs, under the assumption of a specific data-generating condition. Building on these findings, we propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations, which admit a closed-form solution with theoretical guarantees of the desired generalization under topological distribution shifts. The new model, functioning as a versatile graph Transformer, demonstrates superior performance across a wide range of graph learning tasks.
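As a rough illustration of what "local diffusion on graphs" refers to, the sketch below integrates the standard graph heat equation dx/dt = -Lx with explicit Euler steps on a toy path graph and on a copy with one extra edge. It is not the ADiT model or any construction from the paper; the helper names (`laplacian`, `diffuse`), the toy graphs, and the step size are all assumptions chosen for illustration. It only shows that changing a single edge changes the propagated signal.

```python
# Minimal sketch (illustrative assumption, not the paper's method):
# local heat diffusion dx/dt = -L x on a small undirected graph.

def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected graph, as nested lists."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L

def diffuse(L, x, t=1.0, steps=1000):
    """Integrate dx/dt = -L x from time 0 to t with explicit Euler steps."""
    dt = t / steps
    x = list(x)
    for _ in range(steps):
        # Each step mixes a node's value only with its direct neighbours.
        Lx = [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]
        x = [x_i - dt * lx_i for x_i, lx_i in zip(x, Lx)]
    return x

# Toy topologies: path 0-1-2-3, and the same graph with one added edge (0, 3).
path_edges = [(0, 1), (1, 2), (2, 3)]
cycle_edges = path_edges + [(0, 3)]

x0 = [1.0, 0.0, 0.0, 0.0]  # one unit of signal placed on node 0

x_path = diffuse(laplacian(4, path_edges), x0)
x_cycle = diffuse(laplacian(4, cycle_edges), x0)

# Largest per-node change caused by the single added edge.
gap = max(abs(a - b) for a, b in zip(x_path, x_cycle))
print("path :", [round(v, 3) for v in x_path])
print("cycle:", [round(v, 3) for v in x_cycle])
print("max change from one added edge:", round(gap, 3))
```

Each Euler step here plays a role loosely analogous to one message-passing layer, which is the usual informal link between diffusion equations and GNNs. The comparison of the two outputs is only a toy view of topology dependence and does not reproduce the paper's sensitivity analysis.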