Towards Better Evaluation of GNN Expressiveness with BREC Dataset

Research on the theoretical expressiveness of Graph Neural Networks (GNNs) has developed rapidly, and many methods have been proposed to enhance it. However, apart from the few methods that strictly follow the $k$-dimensional Weisfeiler-Lehman ($k$-WL) test hierarchy, most lack a uniform expressiveness measure. Their theoretical analyses are often limited to distinguishing certain families of non-isomorphic graphs, which makes it difficult to compare their expressiveness quantitatively. An alternative to theoretical analysis is to measure expressiveness empirically, by evaluating model performance on datasets containing 1-WL-indistinguishable graphs. Previous datasets designed for this purpose, however, have problems with difficulty (any model surpassing 1-WL reaches nearly 100% accuracy), granularity (models tend to be either 100% correct or close to random guessing), and scale (each dataset contains only a few essentially different graphs). To address these limitations, we propose a new expressiveness dataset, $\textbf{BREC}$, which includes 400 pairs of non-isomorphic graphs carefully selected from four primary categories (Basic, Regular, Extension, and CFI). These graphs offer higher difficulty (up to 4-WL-indistinguishable), finer granularity (able to compare models between 1-WL and 3-WL), and a larger scale (400 pairs). Further, we comprehensively test 23 models with higher-than-1-WL expressiveness on our BREC dataset. Our experiment gives the first thorough comparison of the expressiveness of those state-of-the-art beyond-1-WL GNN models. We expect this dataset to serve as a benchmark for testing the expressiveness of future GNNs. Our dataset and evaluation code are released at: https://github.com/GraphPKU/BREC.
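To make the notion of a 1-WL-indistinguishable pair concrete, the following is a minimal sketch of 1-WL colour refinement in plain Python. It is an illustration only and is not taken from the BREC repository: the helper names (`adjacency`, `one_wl_distinguishes`) and the two toy graph pairs are assumptions chosen for this example. Both graphs of a pair are refined jointly so that their colours are comparable, and the pair counts as distinguished when the final colour histograms differ.

```python
from collections import Counter


def adjacency(n, edges):
    """Build an undirected adjacency list for nodes 0..n-1 (hypothetical helper)."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def one_wl_distinguishes(adj_a, adj_b):
    """Return True if 1-WL colour refinement separates the two graphs."""
    graphs = (adj_a, adj_b)
    # Every node of both graphs starts with the same colour.
    colors = {(g, v): 0 for g in (0, 1) for v in graphs[g]}
    num_classes = 1
    while True:
        # New signature = own colour plus the sorted multiset of neighbour colours.
        sigs = {
            (g, v): (colors[g, v], tuple(sorted(colors[g, u] for u in graphs[g][v])))
            for (g, v) in colors
        }
        # Relabel signatures with small integers, shared across both graphs.
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {node: palette[s] for node, s in sigs.items()}
        # Refinement only ever splits classes, so an unchanged count means stability.
        if len(palette) == num_classes:
            break
        num_classes = len(palette)
    hist_a = Counter(c for (g, _), c in colors.items() if g == 0)
    hist_b = Counter(c for (g, _), c in colors.items() if g == 1)
    return hist_a != hist_b


# A 6-cycle and two disjoint triangles: non-isomorphic, but both 2-regular.
hexagon = adjacency(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
two_triangles = adjacency(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

# A 4-node path and a 4-node star: their degree sequences already differ.
path4 = adjacency(4, [(0, 1), (1, 2), (2, 3)])
star4 = adjacency(4, [(0, 1), (0, 2), (0, 3)])

print(one_wl_distinguishes(hexagon, two_triangles))  # False
print(one_wl_distinguishes(path4, star4))            # True
```

The hexagon and the two triangles are a standard example of a pair that 1-WL cannot separate, since every node in both graphs has degree 2 and refinement never leaves the initial colouring. A model bounded by 1-WL assigns such a pair identical representations, which is the kind of failure that expressiveness datasets of this type are built to expose.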

