Graph Neural Networks (GNNs) have achieved remarkable success across various graph-based tasks but remain highly sensitive to distribution shifts. In this work, we focus on a prevalent yet under-explored phenomenon in graph generalization, Minimal Shift Flip (MSF), in which test samples that deviate only slightly from the training distribution are abruptly misclassified. To interpret this phenomenon, we revisit MSF through the lens of Sharpness-Aware Minimization (SAM), which characterizes the local stability and sharpness of the loss landscape and provides a theoretical foundation for modeling generalization error. To quantify loss sharpness, we introduce the notion of the Local Robust Radius, which measures the smallest perturbation required to flip a prediction and establishes a theoretical link between local stability and generalization. From this perspective, we observe a continual decrease of the robust radius during training, indicating weakened local stability and an increasingly sharp loss landscape that gives rise to MSF. To address both the MSF phenomenon and the intractability of computing this radius, we develop an energy-based formulation that is theoretically proven to be monotonically correlated with the robust radius, offering a tractable and principled objective for modeling flatness and stability. Building on these insights, we propose an energy-driven generative augmentation framework (E2A) that leverages energy-guided latent perturbations to generate pseudo-OOD samples and enhance model generalization. Extensive experiments across multiple benchmarks demonstrate that E2A consistently improves graph OOD generalization, outperforming state-of-the-art baselines.
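For concreteness, a minimal formalization of the Local Robust Radius (the notation here is illustrative, not taken from the paper body): for a classifier $f$ with predicted label $\hat{y}(x) = \arg\max_k f_k(x)$, the radius of a sample $x$ can be written as

\[
r(x) \;=\; \min_{\delta}\; \|\delta\| \quad \text{s.t.} \quad \hat{y}(x + \delta) \neq \hat{y}(x),
\]

so MSF corresponds to test points whose $r(x)$ collapses toward zero, and the energy score $E(x)$ is asserted to vary monotonically with $r(x)$, making it a tractable surrogate for the otherwise intractable minimization.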
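As a rough illustration of the energy-guided latent perturbation step, the following PyTorch sketch ascends an energy function in latent space to push embeddings toward pseudo-OOD regions; energy_fn, step_size, and n_steps are hypothetical placeholders and do not reflect the authors' E2A interface.

```python
import torch

def energy_guided_perturbation(z, energy_fn, step_size=0.1, n_steps=5):
    """Illustrative sketch (not the authors' implementation): perturb latent
    codes z along the gradient of an energy function to synthesize
    pseudo-OOD samples for augmentation."""
    z = z.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        energy = energy_fn(z).sum()               # scalar energy over the batch
        grad = torch.autograd.grad(energy, z)[0]  # direction of steepest energy ascent
        # Ascend the energy surface: under the claimed monotone relation,
        # higher energy corresponds to a smaller robust radius, i.e. samples
        # nearer the decision boundary and farther from the training distribution.
        z = (z + step_size * grad).detach().requires_grad_(True)
    return z.detach()
```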