Subgraph GNNs have recently emerged as an important direction for developing expressive graph neural networks (GNNs). Although numerous architectures have been proposed, it remains poorly understood how the various design paradigms differ in expressive power, and it is unclear which design principle achieves maximal expressiveness with minimal architectural complexity. To address these fundamental questions, this paper conducts a systematic study of general node-based subgraph GNNs through the lens of Subgraph Weisfeiler-Lehman (SWL) tests. Our central result is a complete hierarchy of SWL algorithms with strictly growing expressivity. Concretely, we prove that any node-based subgraph GNN falls into one of six SWL equivalence classes, among which $\mathsf{SSWL}$ achieves the maximal expressive power. We also study how these equivalence classes differ in practical expressiveness, such as the ability to encode graph distance and biconnectivity. Furthermore, we give a tight upper bound on the expressivity of all SWL algorithms by establishing a close relation with localized versions of the WL and Folklore WL (FWL) tests. Our results provide insights into the power of existing subgraph GNNs, guide the design of new architectures, and point out their limitations by revealing an inherent gap with the 2-FWL test. Finally, experiments demonstrate that $\mathsf{SSWL}$-inspired subgraph GNNs can significantly outperform prior architectures on multiple benchmarks despite their great simplicity.
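To make the idea of a node-based subgraph WL test concrete, the following is a minimal sketch, not the $\mathsf{SSWL}$ algorithm studied in the paper. It assumes the simplest setting: one "subgraph" per node $u$, obtained by marking $u$ in the original graph, with plain 1-WL colour refinement run independently on each marked copy. The function names (`wl_refine`, `wl_signature`, `swl_signature`, `cycle`) and the dict-of-lists graph encoding are illustrative choices, not taken from the paper.

```python
from collections import Counter


def wl_refine(adj, init, rounds):
    """Run `rounds` steps of 1-WL colour refinement.

    adj:  dict mapping each node to a list of its neighbours
    init: dict mapping each node to an initial colour
    Colours are nested tuples, so they can be compared across graphs
    without a shared palette. Their size grows quickly with `rounds`,
    which is acceptable only for tiny illustrative graphs.
    """
    colors = dict(init)
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[w] for w in adj[v])))
            for v in adj
        }
    return colors


def wl_signature(adj, rounds=None):
    """Graph-level 1-WL signature: the multiset of final node colours."""
    rounds = len(adj) if rounds is None else rounds
    colors = wl_refine(adj, {v: 0 for v in adj}, rounds)
    return Counter(colors.values())


def swl_signature(adj, rounds=None):
    """Simplified node-marking subgraph WL (an assumption of this sketch).

    For every node u, mark u, refine with 1-WL, and record the multiset
    of node colours; the graph signature is the multiset of these
    per-subgraph multisets.
    """
    rounds = len(adj) if rounds is None else rounds
    per_subgraph = []
    for u in adj:
        init = {v: int(v == u) for v in adj}  # 1 marks the root node u
        colors = wl_refine(adj, init, rounds)
        per_subgraph.append(frozenset(Counter(colors.values()).items()))
    return Counter(per_subgraph)


def cycle(nodes):
    """Adjacency dict of a simple cycle through `nodes`, in order."""
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}


# Two disjoint triangles versus one 6-cycle: both graphs are 2-regular.
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
hexagon = cycle([0, 1, 2, 3, 4, 5])

print(wl_signature(two_triangles) == wl_signature(hexagon))    # True
print(swl_signature(two_triangles) == swl_signature(hexagon))  # False
```

Both graphs are 2-regular, so 1-WL assigns every node the same colour and cannot tell them apart. Marking a node breaks that symmetry: in the 6-cycle the mark eventually reaches every node, while in the two-triangle graph it never leaves its own triangle. The richer variants in the SWL hierarchy additionally let the per-subgraph computations exchange information, which this sketch deliberately omits.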