Exact Generalisation Error Exposes Benchmarks Skew Graph Neural Networks Success (or Failure)

Graph Neural Networks (GNNs) have become the standard method for learning from networks in fields ranging from biology to social systems, yet a principled understanding of what enables them to extract meaningful representations, or why performance varies drastically between similar models, remains elusive. These questions can be addressed through the generalisation error, which measures the discrepancy between a model's predictions and the true values it is meant to recover. Although several works have derived generalisation error bounds, such learning-theoretic bounds are typically loose, restricted to a single architecture, and offer limited insight into what governs generalisation in practice. In this work, we take a fundamentally different approach and derive, through the lens of signal processing, the exact generalisation error for a broad range of linear GNNs, including convolutional, PageRank-based, and attention-based models. This exact generalisation error exposes a strong benchmark bias in the existing literature: commonly used datasets exhibit high alignment between node features and the graph structure, inherently favouring architectures that rely on it. We further show that homophily, the similarity between connected nodes, decisively governs which architectures are best suited to a given graph, thereby explaining how specific benchmark properties systematically shape the performance reported in the literature. Together, these results explain when and why GNNs can effectively leverage structure and feature information, supporting their reliable application.
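As a concrete illustration of the homophily notion used above, the following minimal sketch computes the edge homophily ratio, i.e. the fraction of edges whose endpoints share a label. This is one common way to quantify homophily and is an assumption here: the abstract does not state which measure is used in the paper. The function name `edge_homophily` and the toy graph are hypothetical.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints carry the same label.

    edges  -- list of (u, v) node-index pairs (hypothetical input format)
    labels -- sequence where labels[i] is the class label of node i
    """
    if not edges:
        raise ValueError("edge list must be non-empty")
    # Count the edges that connect two nodes of the same class.
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (illustrative only): 4 nodes on a cycle, two classes.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
labels = [0, 0, 1, 1]

# Edges (0, 1) and (2, 3) join same-label nodes; the other two do not.
h = edge_homophily(edges, labels)
print(f"edge homophily ratio: {h:.2f}")
```

A value near 1 indicates a strongly homophilous graph, while a value near 0 indicates that connected nodes mostly differ in label.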
