Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel

The fundamental principle of Graph Neural Networks (GNNs) is to exploit the structural information of the data by aggregating neighboring nodes with a "graph convolution", in conjunction with suitable choices for the network architecture, such as depth and activation functions. Understanding the influence of each design choice on network performance is therefore crucial. Convolutions based on the graph Laplacian have emerged as the dominant choice, with symmetric normalization of the adjacency matrix the most widely adopted. However, some empirical studies show that row normalization of the adjacency matrix outperforms it in node classification. Despite the widespread use of GNNs, there is no rigorous theoretical study of the representation power of these convolutions that could explain this behavior. Similarly, the empirical observation that linear GNNs perform on par with non-linear ReLU GNNs lacks rigorous theory. In this work, we theoretically analyze the influence of different aspects of the GNN architecture using the Graph Neural Tangent Kernel in a semi-supervised node classification setting. Under the population Degree-Corrected Stochastic Block Model, we prove that: (i) linear networks capture the class information as well as ReLU networks; (ii) row normalization preserves the underlying class structure better than other convolutions; (iii) performance degrades with network depth due to over-smoothing, but the loss in class information is slowest under row normalization; (iv) skip connections retain the class information even at infinite depth, thereby eliminating over-smoothing. Finally, we validate our theoretical findings numerically and on real datasets such as Cora and Citeseer.
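To make the two convolutions compared in the abstract concrete, the following is a minimal sketch of row and symmetric normalization of an adjacency matrix. It assumes the commonly used definitions, D^{-1} A for row normalization and D^{-1/2} A D^{-1/2} for symmetric normalization, where D is the diagonal degree matrix; the abstract itself does not spell out these formulas. The function names, the toy graph, and the use of self-loops are illustrative assumptions, not taken from the paper.

```python
import math

def row_normalize(adj):
    """Assumed form D^{-1} A: divide each row of A by that node's degree."""
    out = []
    for row in adj:
        deg = sum(row)
        out.append([a / deg if deg else 0.0 for a in row])
    return out

def sym_normalize(adj):
    """Assumed form D^{-1/2} A D^{-1/2}: scale entry (i, j) by 1/sqrt(d_i * d_j)."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

# Hypothetical toy graph: a 4-node path 0-1-2-3 with self-loops added.
A = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

S_row = row_normalize(A)  # every row sums to 1 (a neighborhood average)
S_sym = sym_normalize(A)  # symmetric, but rows need not sum to 1
```

Under row normalization each node's aggregated feature is a plain average over its neighborhood, so the result does not scale with node degree. Symmetric normalization keeps the operator symmetric but leaves a degree-dependent weighting in each row.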

