Designing expressive Graph Neural Networks (GNNs) is a central topic in learning graph-structured data. While numerous approaches have been proposed to improve GNNs in terms of the Weisfeiler-Lehman (WL) test, generally there is still a lack of deep understanding of what additional power they can systematically and provably gain. In this paper, we take a fundamentally different perspective to study the expressive power of GNNs beyond the WL test. Specifically, we introduce a novel class of expressivity metrics via graph biconnectivity and highlight their importance in both theory and practice. As biconnectivity can be easily calculated using simple algorithms that have linear computational costs, it is natural to expect that popular GNNs can learn it easily as well. However, after a thorough review of prior GNN architectures, we surprisingly find that most of them are not expressive for any of these metrics. The only exception is the ESAN framework (Bevilacqua et al., 2022), for which we give a theoretical justification of its power. We proceed to introduce a principled and more efficient approach, called the Generalized Distance Weisfeiler-Lehman (GD-WL), which is provably expressive for all biconnectivity metrics. Practically, we show GD-WL can be implemented by a Transformer-like architecture that preserves expressiveness and enjoys full parallelizability. A set of experiments on both synthetic and real datasets demonstrates that our approach can consistently outperform prior GNN architectures.
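To make concrete the remark that biconnectivity can be computed by simple linear-time algorithms, the following is a minimal sketch of a textbook Tarjan-style depth-first search that finds the cut vertices and bridges of an undirected graph. The abstract does not name a specific algorithm, so this is an illustrative assumption rather than the authors' procedure; the function name `cut_vertices_and_bridges` and the edge-list input format are hypothetical, and the sketch assumes a simple graph (no self-loops or parallel edges) with nodes labeled `0..num_nodes-1`.

```python
from collections import defaultdict


def cut_vertices_and_bridges(num_nodes, edges):
    """Return (cut_vertices, bridges) of a simple undirected graph.

    Illustrative helper (not from the paper). Runs in O(V + E) time.
    Bridges are returned as (small, large) node pairs.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [-1] * num_nodes  # DFS discovery time; -1 means unvisited
    low = [0] * num_nodes    # lowest discovery time reachable from the subtree
    cut_vertices, bridges = set(), set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue  # skip the tree edge back to the parent (simple graph assumed)
            if disc[v] != -1:
                # Back edge to an already visited node.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # No back edge from v's subtree reaches u or above: (u, v) is a bridge.
                if low[v] > disc[u]:
                    bridges.add((min(u, v), max(u, v)))
                # No back edge from v's subtree reaches strictly above u:
                # removing a non-root u disconnects v's subtree.
                if parent != -1 and low[v] >= disc[u]:
                    cut_vertices.add(u)
        # The DFS root is a cut vertex only if it has more than one DFS child.
        if parent == -1 and children > 1:
            cut_vertices.add(u)

    for s in range(num_nodes):
        if disc[s] == -1:
            dfs(s, -1)
    return cut_vertices, bridges


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
cuts, brs = cut_vertices_and_bridges(6, example_edges)
print(sorted(cuts), sorted(brs))
```

In the example, the edge joining the two triangles is the only bridge and its two endpoints are the only cut vertices. The recursive form is kept for readability; for very large graphs an iterative DFS would avoid Python's recursion limit.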