Graph neural networks (GNNs) have emerged as powerful tools for processing relational data in a wide range of applications. However, GNNs suffer from the problem of oversmoothing, the property that the features of all nodes converge exponentially to the same vector over layers, prohibiting the design of deep GNNs. In this work we study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features. By generalizing methods from conventional deep neural networks (DNNs), we describe the distribution of features at the output layer of deep GCNs in terms of a GP: as expected, we find that typical parameter choices from the literature lead to oversmoothing. The theory, however, allows us to identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth. We demonstrate the validity of this prediction in finite-size GCNs by training a linear classifier on their output. Moreover, using the linearization of the GCN GP, we generalize the concept of propagation depth of information from DNNs to GCNs. This propagation depth diverges at the transition between the oversmoothing and non-oversmoothing phases. We test the predictions of our approach and find good agreement with finite-size GCNs. Initializing GCNs near the transition to the non-oversmoothing phase, we obtain networks which are both deep and expressive.
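The oversmoothing phenomenon described above can be reproduced in a minimal numerical sketch (an illustration under simplified assumptions, not the paper's exact setup): a randomly initialized tanh GCN on a small ring graph, using the symmetrically normalized adjacency with self-loops. At a standard weight variance (σ_w = 1), the relative spread of node features, which is zero when all nodes share one feature vector, collapses exponentially with depth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example graph: an undirected ring of 10 nodes.
N = 10
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0

# Symmetrically normalized adjacency with self-loops (standard GCN propagation).
A_hat = A + np.eye(N)
d = A_hat.sum(axis=1)
A_hat = A_hat / np.sqrt(np.outer(d, d))

def spread(X):
    """Relative spread of node features: 0 iff all nodes have identical features."""
    return np.linalg.norm(X - X.mean(axis=0)) / np.linalg.norm(X)

# Forward pass of a deep GCN at initialization with weight std sigma_w = 1,
# a typical choice that the abstract predicts leads to oversmoothing.
n_feat, depth, sigma_w = 128, 50, 1.0
X = rng.standard_normal((N, n_feat))
history = [spread(X)]
for _ in range(depth):
    W = rng.standard_normal((n_feat, n_feat)) * sigma_w / np.sqrt(n_feat)
    X = np.tanh(A_hat @ X @ W)
    history.append(spread(X))

print(f"spread at depth 0:    {history[0]:.3f}")
print(f"spread at depth {depth}:   {history[-1]:.3e}")  # collapses toward 0
```

The decay rate is governed by the spectral gap of the normalized adjacency: the node-difference component contracts by roughly its second-largest eigenvalue per layer, while the common component survives, so the spread shrinks geometrically with depth.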