Recently, subgraph GNNs have emerged as an important direction for developing expressive graph neural networks (GNNs). Although numerous architectures have been proposed, there is still limited understanding of how the various design paradigms differ in expressive power, and it remains unclear which design principle achieves maximal expressiveness with minimal architectural complexity. To address these fundamental questions, this paper conducts a systematic study of general node-based subgraph GNNs through the lens of Subgraph Weisfeiler-Lehman Tests (SWL). Our central result is a complete hierarchy of SWL with strictly growing expressivity. Concretely, we prove that any node-based subgraph GNN falls into one of six SWL equivalence classes, among which $\mathsf{SSWL}$ achieves the maximal expressive power. We also study how these equivalence classes differ in practical expressiveness, such as the ability to encode graph distance and biconnectivity. In addition, we give a tight upper bound on the expressivity of all SWL algorithms by establishing a close relation with localized versions of Folklore WL tests (FWL). Overall, our results provide insights into the power of existing subgraph GNNs, guide the design of new architectures, and point out their limitations by revealing an inherent gap with the 2-FWL test. Finally, experiments on the ZINC benchmark demonstrate that $\mathsf{SSWL}$-inspired subgraph GNNs can significantly outperform prior architectures despite their simplicity.