Our study reveals new theoretical insights into over-smoothing and feature over-correlation in deep graph neural networks. We show the prevalence of invariant subspaces, demonstrating a fixed relative behavior that is unaffected by feature transformations. Our work clarifies recent observations related to convergence to a constant state and a potential over-separation of node states, as the amplification of subspaces only depends on the spectrum of the aggregation function. In linear scenarios, this leads to node representations being dominated by a low-dimensional subspace with an asymptotic convergence rate independent of the feature transformations. This causes a rank collapse of the node representations, resulting in over-smoothing when smooth vectors span this subspace, and over-correlation even when over-smoothing is avoided. Guided by our theory, we propose a sum of Kronecker products as a beneficial property that can provably prevent over-smoothing, over-correlation, and rank collapse. We empirically extend our insights to the non-linear case, demonstrating the inability of existing models to capture linearly independent features.
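The linear case described above can be illustrated with a small numerical sketch. It relies on the standard identity vec(AXW) = (Wᵀ ⊗ A) vec(X) and tracks how much of the node representation lies in the dominant eigenspace of the aggregation matrix as depth grows. Everything specific here is an assumption made for illustration and is not taken from the paper: the 6-node ring graph, the random Gaussian feature transformations, the helper names, and the choice of the identity as the second aggregation matrix in the sum-of-Kronecker-products layer.

```python
import numpy as np

# Hypothetical toy graph: a 6-node ring with self-loops (illustrative assumption).
n, d = 6, 4
adj = np.eye(n)
for i in range(n):
    adj[i, (i + 1) % n] = 1.0
    adj[(i + 1) % n, i] = 1.0

# Symmetrically normalized aggregation D^{-1/2} A D^{-1/2}.
# The graph is regular, so the dominant eigenvector is the constant vector.
deg = adj.sum(axis=1)
agg = adj / np.sqrt(np.outer(deg, deg))


def kron_form_matches(a, x, w):
    """Check vec(A X W) == (W^T kron A) vec(X), using column-major vec."""
    lhs = (a @ x @ w).flatten(order="F")
    rhs = np.kron(w.T, a) @ x.flatten(order="F")
    return np.allclose(lhs, rhs)


def constant_share(x):
    """Fraction of the squared Frobenius norm of x lying in span{constant vector}."""
    ones = np.ones((x.shape[0], 1)) / np.sqrt(x.shape[0])
    proj = ones @ (ones.T @ x)
    return np.linalg.norm(proj) ** 2 / np.linalg.norm(x) ** 2


def run_linear_layers(a, x, num_layers, rng):
    """Apply X <- A X W with a fresh random W per layer (no non-linearity)."""
    for _ in range(num_layers):
        w = rng.standard_normal((x.shape[1], x.shape[1]))
        x = a @ x @ w
        x = x / np.linalg.norm(x)  # rescaling only; relative behavior is unchanged
    return x


def skp_layer(aggs, x, weights):
    """Sum-of-Kronecker-products update: X <- sum_i A_i X W_i."""
    return sum(a @ x @ w for a, w in zip(aggs, weights))


def skp_kron_form_matches(aggs, x, weights):
    """Check vec(sum_i A_i X W_i) == (sum_i W_i^T kron A_i) vec(X)."""
    op = sum(np.kron(w.T, a) for a, w in zip(aggs, weights))
    lhs = skp_layer(aggs, x, weights).flatten(order="F")
    return np.allclose(lhs, op @ x.flatten(order="F"))


rng = np.random.default_rng(0)
x0 = rng.standard_normal((n, d))

for depth in (1, 4, 16, 64):
    share = constant_share(run_linear_layers(agg, x0, depth, rng))
    print(f"depth={depth:3d}  share of constant subspace={share:.6f}")
```

With a single Kronecker product, the share printed by the loop should approach 1 as depth grows, whichever random transformations are drawn, because the relative decay of the other subspaces is governed by the eigenvalues of `agg`. The `skp_layer` helper only shows the form of the proposed update; whether a particular choice of aggregation matrices avoids rank collapse depends on the conditions given in the paper, which this sketch does not verify.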