Message passing neural networks (MPNNs) have emerged as the most popular framework for graph neural networks (GNNs) in recent years. However, their expressive power is bounded by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Some works take inspiration from $k$-WL/FWL (Folklore WL) and design corresponding neural versions. Despite their high expressive power, this line of research has serious limitations. In particular, (1) $k$-WL/FWL requires at least $O(n^k)$ space, which is impractical for large graphs even when $k=3$; (2) the design space of $k$-WL/FWL is rigid, with $k$ as the only adjustable hyper-parameter. To tackle the first limitation, we propose an extension, $(k,t)$-FWL. We theoretically prove that even if we fix the space complexity to $O(n^k)$ (for any $k\geq 2$) in $(k,t)$-FWL, we can construct an expressiveness hierarchy up to solving the graph isomorphism problem. To tackle the second limitation, we propose $k$-FWL+, which considers any equivariant set as neighbors instead of all nodes, thereby greatly expanding the design space of $k$-FWL. Combining these two modifications results in a flexible and powerful framework, $(k,t)$-FWL+. We demonstrate that $(k,t)$-FWL+ can implement most existing models with matching expressiveness. We then introduce an instance of $(k,t)$-FWL+ called Neighborhood$^2$-FWL (N$^2$-FWL), which is practically and theoretically sound. We prove that N$^2$-FWL is no less powerful than 3-WL and can encode many substructures while requiring only $O(n^2)$ space. Finally, we design its neural version, named N$^2$-GNN, and evaluate its performance on various tasks. N$^2$-GNN achieves record-breaking results on ZINC-Subset (0.059) and ZINC-Full (0.013), outperforming previous SOTA results by 10.6% and 40.9%, respectively. Moreover, N$^2$-GNN achieves new SOTA results on the BREC dataset (71.8%) among all existing highly expressive GNN methods.
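To make the 1-WL limitation and the $O(n^k)$ space cost mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names (`refine_colors`, `one_wl_equivalent`, `tuple_count`) and the adjacency-dict graph format are assumptions made for illustration. The sketch runs standard 1-WL color refinement on the disjoint union of two graphs, so that colors are comparable across them, and then compares the per-graph color histograms.

```python
from collections import Counter


def refine_colors(adj):
    """1-WL color refinement on an adjacency dict {node: [neighbors]}.

    Illustrative helper (not from the paper). Returns the stable coloring.
    """
    colors = {v: 0 for v in adj}  # uniform initial color
    while True:
        # Signature = own color plus the sorted multiset of neighbor colors.
        sigs = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress signatures to small integers.
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new_colors = {v: palette[sigs[v]] for v in adj}
        # Refinement only splits classes, so an unchanged class count
        # means the partition is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


def one_wl_equivalent(adj_a, adj_b):
    """True if 1-WL cannot tell the two graphs apart."""
    # Refine both graphs jointly so that color ids mean the same thing.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update(
        {("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()}
    )
    colors = refine_colors(union)
    hist_a = Counter(c for (side, _), c in colors.items() if side == "a")
    hist_b = Counter(c for (side, _), c in colors.items() if side == "b")
    return hist_a == hist_b


def tuple_count(n, k):
    """Number of k-tuples of nodes, i.e. colors a k-WL/FWL run must store."""
    return n ** k


# A 6-cycle and two disjoint triangles are non-isomorphic, but every node
# has degree 2 in both, so 1-WL never separates them.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(one_wl_equivalent(hexagon, two_triangles))  # 1-WL fails here
print(one_wl_equivalent(path3, triangle))         # degrees differ
print(tuple_count(10_000, 3))                     # storage for k = 3
```

The first pair shows the kind of failure that motivates higher-order tests, and `tuple_count` shows why those tests become costly: with $n = 10{,}000$ nodes and $k = 3$, one color per tuple already means $10^{12}$ entries.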