Message passing is the dominant paradigm in Graph Neural Networks (GNNs). Its efficiency, however, can be limited by the topology of the graph: information is lost during propagation when it is oversquashed while travelling through bottlenecks. To remedy this, recent efforts have focused on graph rewiring techniques, which decouple the computational graph, on which message passing is performed, from the input graph given by the data. A prominent approach is to use discrete graph curvature measures, of which several variants have been proposed, to identify bottlenecks and rewire around them, facilitating information propagation. While oversquashing has been demonstrated on synthetic datasets, in this work we reevaluate the performance gains that curvature-based rewiring brings to real-world datasets. We show that in these datasets, the edges selected during the rewiring process do not satisfy the theoretical criteria that identify bottlenecks, implying that they do not necessarily oversquash information during message passing. Subsequently, we demonstrate that state-of-the-art accuracies on these datasets are outliers produced by hyperparameter sweeps (over both the training hyperparameters and those dedicated to the rewiring algorithm) rather than consistent performance gains. In conclusion, our analysis nuances the effectiveness of curvature-based rewiring on real-world datasets and brings a new perspective on how to evaluate GNN accuracy improvements.
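To make the curvature-based rewiring idea concrete, below is a minimal, self-contained sketch. It uses the augmented 1D Forman curvature F(u,v) = 4 - deg(u) - deg(v) + 3·#triangles as a stand-in for the more elaborate curvature measures discussed above (e.g. Balanced Forman curvature), and adds one support edge around the most negatively curved edge, the bottleneck candidate. All function names here are illustrative, not taken from any specific library.

```python
# Sketch of curvature-based rewiring on a plain adjacency-set graph.
# Uses the augmented 1D Forman curvature as a simple illustrative
# curvature; real methods (e.g. SDRF) use more refined measures.

def forman_curvature(adj, u, v):
    """F(u, v) = 4 - deg(u) - deg(v) + 3 * (# triangles on the edge)."""
    triangles = len(adj[u] & adj[v])
    return 4 - len(adj[u]) - len(adj[v]) + 3 * triangles

def most_negative_edge(adj):
    """Return the edge with minimal curvature: the bottleneck candidate."""
    edges = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}
    return min(edges, key=lambda e: forman_curvature(adj, *e))

def rewire_once(adj):
    """Add one support edge between the neighbourhoods of the most
    negatively curved edge, widening the bottleneck around it."""
    u, v = most_negative_edge(adj)
    # Candidate endpoints: neighbours of u and of v not yet connected.
    for a in sorted(adj[u] - adj[v] - {v}):
        for b in sorted(adj[v] - adj[u] - {u}):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                return (a, b)
    return None  # no candidate pair left to connect

# Two triangles joined by a single bridge edge (2, 3): the bridge is
# the only edge with negative curvature, so rewiring targets it.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(forman_curvature(adj, 2, 3))  # -2 (the bridge)
print(most_negative_edge(adj))      # (2, 3)
print(rewire_once(adj))             # adds a shortcut across the bridge
```

On this toy barbell graph the bridge edge (2, 3) has curvature -2 while every intra-triangle edge is positively curved, so the rewiring step adds an extra edge between the two clusters. The paper's point is precisely that on real-world datasets, the edges selected this way often fail the theoretical bottleneck criteria.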