The recent Long-Range Graph Benchmark (LRGB, Dwivedi et al. 2022) introduced a set of graph learning tasks that depend strongly on long-range interactions between vertices. Empirical evidence suggests that on these tasks Graph Transformers significantly outperform Message Passing GNNs (MPGNNs). In this paper, we carefully reevaluate multiple MPGNN baselines as well as the Graph Transformer GPS (Rampášek et al. 2022) on LRGB. Through a rigorous empirical analysis, we demonstrate that the reported performance gap is overestimated due to suboptimal hyperparameter choices. Notably, across multiple datasets the performance gap completely vanishes after basic hyperparameter optimization. In addition, we discuss the impact of missing feature normalization for LRGB's vision datasets and highlight a spurious implementation of LRGB's link prediction metric. The principal aim of our paper is to establish a higher standard of empirical rigor within the graph machine learning community.