Research on the theoretical expressiveness of Graph Neural Networks (GNNs) has developed rapidly, and many methods have been proposed to enhance it. However, apart from the few methods that strictly follow the $k$-dimensional Weisfeiler-Lehman ($k$-WL) test hierarchy, most lack a uniform measure of expressiveness. Their theoretical analyses are often limited to distinguishing particular families of non-isomorphic graphs, which makes it difficult to compare their expressiveness quantitatively. An alternative to theoretical analysis is to measure expressiveness empirically, by evaluating model performance on datasets containing 1-WL-indistinguishable graphs. Previous datasets designed for this purpose, however, have problems with difficulty (any model surpassing 1-WL achieves nearly 100% accuracy), granularity (models tend to be either 100% correct or close to random guessing), and scale (each dataset contains only a few essentially different graphs). To address these limitations, we propose a new expressiveness dataset, $\textbf{BREC}$, which includes 400 pairs of non-isomorphic graphs carefully selected from four primary categories (Basic, Regular, Extension, and CFI). These graphs offer higher difficulty (up to 4-WL-indistinguishable), finer granularity (able to compare models between 1-WL and 3-WL), and a larger scale (400 pairs). Further, we comprehensively test 23 models with higher-than-1-WL expressiveness on BREC. Our experiments give the first thorough comparison of the expressiveness of these state-of-the-art beyond-1-WL GNN models. We expect this dataset to serve as a benchmark for testing the expressiveness of future GNNs. Our dataset and evaluation code are released at: https://github.com/GraphPKU/BREC.
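To make the notion of a 1-WL-indistinguishable pair concrete, the following is a minimal sketch of 1-WL colour refinement in plain Python. It is an illustration only and is not taken from the BREC evaluation code; the function names (`wl_colors`, `one_wl_distinguishes`) and the example graphs are assumptions introduced here. The two graphs are refined jointly on their disjoint union so that both share one colour palette, and the pair counts as distinguished if the final colour histograms differ.

```python
from collections import Counter


def wl_colors(adjacency, rounds):
    """Run 1-WL colour refinement and return the final colour of each node.

    `adjacency` maps each node to a list of its neighbours (undirected graph).
    """
    colors = {v: 0 for v in adjacency}  # every node starts with the same colour
    for _ in range(rounds):
        # A node's signature is its own colour plus the multiset of neighbour colours.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Compress signatures to small integers, shared across the whole graph.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adjacency}
    return colors


def one_wl_distinguishes(graph_a, graph_b):
    """Return True if 1-WL assigns different colour histograms to the two graphs."""
    # Build the disjoint union so both graphs are refined with one shared palette.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in graph_a.items()}
    union.update({("b", v): [("b", u) for u in nbrs] for v, nbrs in graph_b.items()})
    # The number of nodes is a safe upper bound on the rounds needed to stabilise.
    colors = wl_colors(union, rounds=len(union))
    hist_a = Counter(c for (side, _), c in colors.items() if side == "a")
    hist_b = Counter(c for (side, _), c in colors.items() if side == "b")
    return hist_a != hist_b


# A 6-cycle and two disjoint triangles: non-isomorphic, but both are 2-regular.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
# A 6-node path, whose end nodes have degree 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

print(one_wl_distinguishes(hexagon, two_triangles))  # False
print(one_wl_distinguishes(hexagon, path))           # True
```

The hexagon and the two triangles are a standard example of a pair that 1-WL cannot separate, because every node in both graphs has exactly two neighbours and the colouring never refines. A model bounded by 1-WL would therefore produce identical outputs for the two graphs, which is the kind of failure an expressiveness dataset is designed to expose.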