Graph Neural Networks (GNNs) are powerful tools for addressing learning problems on graph structures, with a wide range of applications in molecular biology and social networks. However, the theoretical foundations underlying their empirical performance are not well understood. In this article, we examine the convergence of gradient dynamics in the training of linear GNNs. Specifically, we prove that gradient flow training of a linear GNN with mean squared loss converges to the global minimum at an exponential rate. The convergence rate depends explicitly on the initial weights and the graph shift operator, a dependence we validate on synthetic datasets generated from well-known graph models and on real-world datasets. Furthermore, we discuss how the gradient flow selects, among the global minima, the one that minimizes the total weight norm. In addition to the gradient flow, we study the convergence of linear GNNs under gradient descent training, an iterative scheme that can be viewed as a discretization of the gradient flow.
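To make the setting concrete, the following is a minimal sketch (not the paper's exact construction) of the training dynamics described above: a single-layer linear GNN $\hat{Y} = SXW$ with mean squared loss, trained by gradient descent as a discretization of the gradient flow $\dot{W} = -\nabla \ell(W)$. The graph model, normalization of the shift operator $S$, dimensions, and step size are all illustrative assumptions.

```python
import numpy as np

# Sketch only: one-layer linear GNN  y_hat = S X W  with mean squared loss,
# trained by gradient descent (a discretization of the gradient flow).
# S is a graph shift operator; here we assume a symmetrically normalized
# adjacency matrix of a random graph, which is one common choice.

rng = np.random.default_rng(0)

n, d, k = 50, 8, 3                            # nodes, input features, outputs
A = (rng.random((n, n)) < 0.1).astype(float)  # Erdos-Renyi-style adjacency
A = np.triu(A, 1); A = A + A.T                # symmetric, no self-loops
deg = A.sum(1); deg[deg == 0] = 1.0
S = A / np.sqrt(np.outer(deg, deg))           # normalized graph shift operator

X = rng.standard_normal((n, d))               # node features
W_star = rng.standard_normal((d, k))
Y = S @ X @ W_star                            # realizable targets: optimum loss is 0

W = 0.1 * rng.standard_normal((d, k))         # initial weights
eta = 1e-2                                    # step size of the discretization

for t in range(2001):
    R = S @ X @ W - Y                         # residual
    loss = 0.5 * np.sum(R**2)                 # (unnormalized) mean squared loss
    grad = (S @ X).T @ R                      # gradient of the loss in W
    W -= eta * grad                           # gradient descent step
    if t % 500 == 0:
        print(f"step {t:4d}  loss {loss:.3e}")
```

In this convex single-layer sketch, the loss decays geometrically toward zero, with a rate governed by the spectrum of $(SX)^\top SX$; this mirrors, in the simplest case, the abstract's claim that the exponential convergence rate depends on the graph shift operator and the initialization.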