Graph neural networks (GNNs) have driven major advances in graph representation learning, exhibiting stronger feature learning and performance than multilayer perceptrons (MLPs) on graph inputs. However, the understanding of feature learning in GNNs is still at an early stage. This study aims to bridge this gap by investigating the role of graph convolution within the feature learning theory of neural networks trained by gradient descent. We provide a distinct characterization of signal learning and noise memorization in two-layer graph convolutional networks (GCNs), contrasting them with two-layer convolutional neural networks (CNNs). Our findings reveal that, relative to the counterpart CNNs, graph convolution significantly enlarges the benign overfitting regime, in which signal learning surpasses noise memorization, by a factor of approximately $\sqrt{D}^{q-2}$, where $D$ denotes a node's expected degree and $q > 2$ is the power of the ReLU activation function. These findings highlight a substantial discrepancy between GNNs and MLPs in feature learning and generalization capacity after gradient descent training, a conclusion further substantiated by our empirical simulations.
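To make the quantities in the abstract concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a row-normalized adjacency with self-loops as the graph convolution, an Erdős–Rényi random graph as a stand-in for the data model, and the hypothetical helper names `poly_relu`, `graph_convolve`, and `regime_factor`. It illustrates two things: the $\sqrt{D}^{q-2}$ factor as plain arithmetic, and the intuition that averaging independent noise over roughly $D$ neighbors shrinks its scale.

```python
import numpy as np


def poly_relu(z, q=3):
    """Polynomial ReLU activation: max(z, 0) raised to the power q."""
    return np.maximum(z, 0.0) ** q


def graph_convolve(adj, x):
    """Average each node's features over its neighborhood.

    Assumption: the convolution is D^{-1}(A + I), i.e. a row-normalized
    adjacency with self-loops. The abstract does not fix the normalization.
    """
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)  # node degrees (>= 1)
    return (a_hat @ x) / deg


def regime_factor(expected_degree, q):
    """The factor sqrt(D)^(q-2) quoted in the abstract."""
    return np.sqrt(expected_degree) ** (q - 2)


# Illustrative setup (all values are assumptions, not from the paper).
rng = np.random.default_rng(0)
n, d, p = 200, 50, 0.1                      # nodes, feature dim, edge prob.

# Symmetric Erdos-Renyi adjacency without self-loops.
upper = np.triu(rng.random((n, n)) < p, k=1)
adj = (upper | upper.T).astype(float)

# Pure-noise node features: aggregation should reduce their scale.
noise = rng.standard_normal((n, d))
smoothed = graph_convolve(adj, noise)

print("noise std before aggregation:", noise.std())
print("noise std after aggregation: ", smoothed.std())
print("factor for D=16, q=3:", regime_factor(16, 3))  # sqrt(16)^(3-2) = 4.0
```

With $q = 3$ the factor reduces to $\sqrt{D}$, so a larger expected degree corresponds to a wider regime under the abstract's claim. The noise-averaging part of the sketch is only an intuition aid; it does not reproduce the paper's analysis of gradient descent dynamics.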