Designing expressive Graph Neural Networks (GNNs) is a central topic in learning from graph-structured data. While numerous approaches have been proposed to improve GNNs in terms of the Weisfeiler-Lehman (WL) test, there is still a lack of deep understanding of what additional power they systematically and provably gain. In this paper, we take a fundamentally different perspective to study the expressive power of GNNs beyond the WL test. Specifically, we introduce a novel class of expressivity metrics via graph biconnectivity and highlight their importance in both theory and practice. As biconnectivity can be computed by simple algorithms with linear computational cost, it is natural to expect that popular GNNs can learn it easily as well. However, after a thorough review of prior GNN architectures, we find, surprisingly, that most of them are not expressive for any of these metrics. The only exception is the ESAN framework, for which we give a theoretical justification of its power. We proceed to introduce a principled and more efficient approach, called the Generalized Distance Weisfeiler-Lehman (GD-WL), which is provably expressive for all biconnectivity metrics. Practically, we show that GD-WL can be implemented by a Transformer-like architecture that preserves expressiveness and enjoys full parallelizability. A set of experiments on both synthetic and real datasets demonstrates that our approach can consistently outperform prior GNN architectures.
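To make the linear-cost claim concrete, the following is a minimal sketch of the classical depth-first-search approach (Tarjan's low-link technique) for finding cut vertices and bridges, two standard biconnectivity notions. It is an illustration only and not code from the paper: the function name `cut_vertices_and_bridges`, the edge-list input format, and the restriction to simple undirected graphs with nodes labeled `0..num_nodes-1` are assumptions made here. The recursive form is chosen for readability; very deep graphs would need an iterative DFS.

```python
from collections import defaultdict


def cut_vertices_and_bridges(num_nodes, edges):
    """Return (cut_vertices, bridges) of a simple undirected graph in O(V + E).

    Hypothetical helper for illustration; assumes no self-loops or multi-edges.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [-1] * num_nodes  # DFS discovery time of each node (-1 = unvisited)
    low = [0] * num_nodes    # lowest discovery time reachable from the node's subtree
    cut_vertices, bridges = set(), set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue  # skip the tree edge back to the parent
            if disc[v] != -1:
                # Back edge: u can reach an already-discovered node.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    # v's subtree cannot reach u or above without edge (u, v).
                    bridges.add((min(u, v), max(u, v)))
                if parent != -1 and low[v] >= disc[u]:
                    # Removing non-root u disconnects v's subtree.
                    cut_vertices.add(u)
        if parent == -1 and children > 1:
            # A DFS root is a cut vertex iff it has more than one DFS child.
            cut_vertices.add(u)

    for s in range(num_nodes):
        if disc[s] == -1:
            dfs(s, -1)
    return cut_vertices, bridges
```

For example, on a triangle `0-1-2` with a tail `2-3-4`, calling `cut_vertices_and_bridges(5, [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)])` reports nodes 2 and 3 as cut vertices and edges (2, 3) and (3, 4) as bridges. Each node and edge is visited a constant number of times, which is where the linear cost comes from.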