Our study reveals new theoretical insights into over-smoothing and feature over-correlation in deep graph neural networks. We show the prevalence of invariant subspaces, demonstrating a fixed relative behavior that is unaffected by feature transformations. Our work clarifies recent observations related to convergence to a constant state and a potential over-separation of node states, as the amplification of subspaces only depends on the spectrum of the aggregation function. In linear scenarios, this leads to node representations being dominated by a low-dimensional subspace with an asymptotic convergence rate independent of the feature transformations. This causes a rank collapse of the node representations, resulting in over-smoothing when smooth vectors span this subspace, and over-correlation even when over-smoothing is avoided. Guided by our theory, we propose a sum of Kronecker products as a beneficial property that can provably prevent over-smoothing, over-correlation, and rank collapse. We empirically extend our insights to the non-linear case, demonstrating the inability of existing models to capture linearly independent features.
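The linear-case claim can be illustrated with a small numerical sketch. This is not the authors' code: the 6-node graph, the symmetric normalized adjacency with self-loops, the random orthogonal feature transformations, and all function names below are assumptions chosen for illustration. Orthogonal weights are used so that the feature transformations cannot themselves shrink any direction, and any collapse observed must come from the aggregation spectrum.

```python
import numpy as np


def normalized_adjacency(edges, num_nodes):
    """Symmetric normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    adj = np.eye(num_nodes)
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def random_orthogonal(dim, rng):
    """Random orthogonal matrix via QR; all of its singular values equal 1."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def run_linear_gnn(agg, features, num_layers, rng):
    """Apply X <- agg @ X @ W repeatedly, drawing a fresh orthogonal W per layer."""
    x = features
    for _ in range(num_layers):
        x = agg @ x @ random_orthogonal(x.shape[1], rng)
    return x


rng = np.random.default_rng(0)

# Hypothetical toy graph: a 6-cycle with one chord (connected, non-regular).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
agg = normalized_adjacency(edges, num_nodes=6)

x0 = rng.standard_normal((6, 4))  # random initial node features
x_deep = run_linear_gnn(agg, x0, num_layers=64, rng=rng)

# Rank collapse: the second singular value vanishes relative to the first.
singular_values = np.linalg.svd(x_deep, compute_uv=False)
rank_ratio = singular_values[1] / singular_values[0]

# Every feature column aligns with the dominant eigenvector of the aggregation.
eigvals, eigvecs = np.linalg.eigh(agg)  # eigenvalues in ascending order
dominant = eigvecs[:, -1]
columns = x_deep / np.linalg.norm(x_deep, axis=0)
alignment = np.abs(dominant @ columns)

print("sigma_2 / sigma_1:", rank_ratio)
print("|cos| with dominant eigenvector, per feature column:", alignment)
```

In this sketch the product of orthogonal weights is itself orthogonal, so the singular values of the deep representation match those of the repeated aggregation applied to the initial features. The ratio of the second to the first singular value therefore decays at a rate set by the aggregation spectrum alone, which matches the transformation-independent convergence rate the abstract describes for the linear case.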