Graph Neural Networks (GNNs) typically operate by message passing, where the state of a node is updated based on the information received from its neighbours. Most message-passing models act as graph convolutions, where features are mixed by a shared linear transformation before being propagated over the edges. On node-classification tasks, graph convolutions have been shown to suffer from two limitations: poor performance on heterophilic graphs, and over-smoothing. It is commonly believed that both phenomena occur because such models behave as low-pass filters, meaning that the Dirichlet energy of the features decreases across layers, incurring a smoothing effect that ultimately renders the features indistinguishable. In this work, we rigorously prove that simple graph-convolutional models can actually enhance high frequencies and even lead to an asymptotic behaviour we refer to as over-sharpening, the opposite of over-smoothing. We do so by showing that linear graph convolutions with symmetric weights minimize a multi-particle energy that generalizes the Dirichlet energy; in this setting, the weight matrices induce edge-wise attraction (repulsion) through their positive (negative) eigenvalues, thereby controlling whether the features are smoothed or sharpened. We also extend the analysis to non-linear GNNs, and demonstrate that some existing time-continuous GNNs are instead always dominated by the low frequencies. Finally, we validate our theoretical findings through ablations and real-world experiments.
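The claim that the sign of the weight eigenvalues decides between smoothing and sharpening can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the residual update F ← F + τ Ā F W with symmetric-normalized adjacency Ā, the 5-node cycle graph, the step size τ = 0.5, and the choices W = +I and W = −I as the simplest symmetric weights with all-positive or all-negative eigenvalues. The quantity tracked is the Dirichlet energy divided by the squared feature norm (a Rayleigh quotient), which tends to 0 when low frequencies dominate and to the largest Laplacian eigenvalue when high frequencies dominate.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (assumes no isolated nodes)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def dirichlet_rayleigh(F, A_norm):
    """Dirichlet energy trace(F^T (I - A_norm) F) divided by ||F||^2."""
    L = np.eye(A_norm.shape[0]) - A_norm
    return np.trace(F.T @ L @ F) / np.sum(F * F)

def run_linear_gcn(F0, A_norm, W, tau=0.5, steps=200):
    """Iterate the assumed update F <- F + tau * A_norm @ F @ W."""
    F = F0.copy()
    for _ in range(steps):
        F = F + tau * A_norm @ F @ W
        # Rescale to avoid overflow; the Rayleigh quotient is scale-invariant.
        F = F / np.linalg.norm(F)
    return F

# Illustrative graph: a 5-node cycle (connected and not bipartite).
n, channels = 5, 3
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = 1.0
    A[(i + 1) % n, i] = 1.0
A_norm = normalized_adjacency(A)

# Largest eigenvalue of the normalized Laplacian, i.e. the highest frequency.
lam_max = np.linalg.eigvalsh(np.eye(n) - A_norm).max()

rng = np.random.default_rng(0)
F0 = rng.standard_normal((n, channels))

# W with positive eigenvalues (attraction) versus negative eigenvalues (repulsion).
smooth_q = dirichlet_rayleigh(run_linear_gcn(F0, A_norm, np.eye(channels)), A_norm)
sharp_q = dirichlet_rayleigh(run_linear_gcn(F0, A_norm, -np.eye(channels)), A_norm)

print("W = +I  Rayleigh quotient:", smooth_q)
print("W = -I  Rayleigh quotient:", sharp_q)
print("largest Laplacian eigenvalue:", lam_max)
```

In this toy setting the quotient for W = +I collapses towards zero, while for W = −I it approaches the largest Laplacian eigenvalue, which matches the attraction versus repulsion picture described above. A weight matrix with eigenvalues of both signs would mix the two effects, with the outcome depending on their relative magnitudes.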