Oversmoothing is a fundamental challenge in graph neural networks (GNNs): as the number of layers increases, node embeddings become increasingly similar and model performance drops sharply. Traditionally, oversmoothing has been quantified with metrics that measure the similarity of neighbouring node features, such as the Dirichlet energy. We argue that these metrics have critical limitations and fail to reliably capture oversmoothing in realistic scenarios. For instance, they provide meaningful insight only for very deep networks, whereas typical GNNs exhibit a performance drop with as few as 10 layers. As an alternative, we propose measuring oversmoothing via the numerical or effective rank of the feature representations. Through extensive numerical evaluation across diverse graph architectures and datasets, we show that rank-based metrics consistently capture oversmoothing, whereas energy-based metrics often fail to do so. Notably, we reveal that drops in rank align closely with performance degradation, even in scenarios where energy metrics remain unchanged. Alongside the experimental evaluation, we provide theoretical support for this approach, clarifying why Dirichlet-like measures may fail to capture the performance drop and proving that the numerical rank of the feature representations collapses to one for a broad family of GNN architectures.
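To make the two metric families concrete, the following is a minimal sketch in NumPy, not the authors' code: it computes an unnormalized Dirichlet energy over an edge list, the entropy-based effective rank (the exponential of the entropy of the normalized singular values, as in Roy & Vetterli, 2007), and the numerical (stable) rank, the ratio of squared Frobenius to squared spectral norm. The exact normalizations used in the paper may differ (e.g., a degree-scaled Dirichlet energy); the toy check at the end illustrates the qualitative point that features can collapse to rank one while the energy stays bounded away from zero.

```python
# Minimal sketch of energy- vs. rank-based oversmoothing metrics.
# Assumes X holds one feature row per node and edges is an (E, 2) index array.
import numpy as np

def dirichlet_energy(X, edges):
    """Energy-based metric: sum of squared feature differences over edges,
    E(X) = sum_{(i,j) in E} ||x_i - x_j||^2. Low energy means neighbouring
    embeddings are similar. (Unnormalized variant; the paper's definition
    may include degree scaling.)"""
    diffs = X[edges[:, 0]] - X[edges[:, 1]]
    return float(np.sum(diffs ** 2))

def effective_rank(X, eps=1e-12):
    """Rank-based metric: exp of the Shannon entropy of the normalized
    singular-value distribution of X (Roy & Vetterli, 2007)."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))

def numerical_rank(X):
    """Numerical (stable) rank ||X||_F^2 / ||X||_2^2; approaches 1 when all
    rows align with a single direction."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    return float(np.sum(s ** 2) / s[0] ** 2)

# Toy check: rank-1 node features (every row a multiple of one direction)
# have numerical/effective rank ~1, yet nonzero Dirichlet energy, since the
# rows still differ in scale.
rng = np.random.default_rng(0)
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])
X = rng.normal(size=(4, 1)) @ rng.normal(size=(1, 8))  # rank-1 features
print(dirichlet_energy(X, edges), numerical_rank(X), effective_rank(X))
```

The toy example mirrors the abstract's claim in miniature: a rank-based metric registers the collapse of the representations even when an energy-based metric alone would not flag it.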