A common theoretical approach to understanding neural networks (NNs) is to take an infinite-width limit, at which point the outputs become Gaussian process (GP) distributed. This is known as a neural network Gaussian process (NNGP). However, the NNGP kernel is fixed and tunable only through a small number of hyperparameters, eliminating any possibility of representation learning. This contrasts with finite-width NNs, which are often believed to perform well precisely because they are able to learn representations. Thus, in simplifying NNs to make them theoretically tractable, NNGPs may eliminate exactly what makes NNs work well (representation learning). This motivated us to investigate whether representation learning is necessary in a range of graph classification tasks. We develop a precise tool for this task, the graph convolutional deep kernel machine. This is very similar to an NNGP, in that it is an infinite-width limit and uses kernels, but comes with a "knob" to control the amount of representation learning. We found that representation learning is necessary (in the sense that it gives dramatic performance improvements) in graph classification tasks and heterophilous node classification tasks, but not in homophilous node classification tasks.
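To make concrete the sense in which an NNGP kernel is "fixed", the following is a minimal sketch of the standard NNGP kernel recursion for a plain fully connected ReLU network. It is not the graph convolutional deep kernel machine described above. The function name `nngp_relu_kernel` and the parameters `depth`, `sigma_w`, and `sigma_b` are illustrative assumptions. The resulting kernel depends only on the inputs and these few hyperparameters, never on the labels, so no representation is learned.

```python
import math


def nngp_relu_kernel(X, depth, sigma_w=math.sqrt(2.0), sigma_b=0.0):
    """Kernel matrix of an infinite-width fully connected ReLU network.

    X is a list of equal-length feature vectors (lists of floats).
    depth, sigma_w and sigma_b are the only tunable quantities (assumed names).
    """
    n, d = len(X), len(X[0])
    # Input-layer kernel: scaled inner products of the raw features.
    K = [
        [
            sigma_b ** 2
            + sigma_w ** 2 * sum(a * b for a, b in zip(X[i], X[j])) / d
            for j in range(n)
        ]
        for i in range(n)
    ]
    for _ in range(depth):
        diag = [K[i][i] for i in range(n)]
        new_K = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                norm = math.sqrt(diag[i] * diag[j])
                if norm == 0.0:
                    # Degenerate case: a zero-variance pre-activation.
                    new_K[i][j] = sigma_b ** 2
                    continue
                # Clamp to guard against round-off outside [-1, 1].
                cos_t = max(-1.0, min(1.0, K[i][j] / norm))
                theta = math.acos(cos_t)
                # Closed-form Gaussian expectation of relu(u) * relu(v).
                expect = norm * (math.sin(theta) + (math.pi - theta) * cos_t)
                new_K[i][j] = sigma_b ** 2 + sigma_w ** 2 / (2.0 * math.pi) * expect
        K = new_K
    return K


# Usage: two orthogonal inputs, one hidden layer.
K = nngp_relu_kernel([[1.0, 0.0], [0.0, 1.0]], depth=1)
```

Because nothing in this recursion is fitted to data, changing the training labels leaves `K` unchanged; only `depth`, `sigma_w`, and `sigma_b` can be tuned.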