Graph machine learning has enjoyed a meteoric rise in popularity since the introduction of deep learning in graph contexts. This is unsurprising given the ubiquity of graph data in large-scale industrial settings. Tacitly assumed in all graph learning tasks is a separation between graph structure and node features: node features strictly encode individual data, while the graph structure consists only of pairwise interactions. The driving belief is that node features are, by themselves, insufficient for these tasks, so benchmark performance accurately reflects improvements in graph learning. In our paper, we challenge this orthodoxy by showing that, surprisingly, node features are often more than sufficient for many common graph benchmarks, breaking this critical assumption. When compared against a well-tuned feature-only MLP baseline on seven of the most commonly used graph learning datasets, graph structure provides little benefit on five of them. We posit that these datasets do not benefit considerably from graph learning because the features themselves already contain enough graph information to obviate or substantially reduce the need for the graph. To illustrate this point, we perform a feature study on these datasets and show how the features are responsible for closing the gap between MLP and graph-method performance. Further, in service of introducing better empirical measures of progress for graph neural networks, we present a challenging parametric family of principled synthetic datasets that necessitate graph information for nontrivial performance. Lastly, we identify a subset of real-world datasets that are not trivially solved by an MLP and hence serve as reasonable benchmarks for graph neural networks.