Graph neural networks (GNNs), with their ability to integrate graph information, have been widely used for data analysis. However, the expressive power of GNNs has been studied only for graph-level tasks and not for node-level tasks such as node classification, where one tries to interpolate missing node labels from the observed ones. In this paper, we study the expressive power of GNNs for this classification task, which is in essence a function interpolation problem. Specifically, we derive the number of weights and layers needed for a GNN to interpolate a bandlimited function in $\mathbb{R}^d$. Our result shows that the number of weights needed to $\epsilon$-approximate a bandlimited function using the GNN architecture is much smaller than the best known count for a fully connected neural network (NN). In particular, one needs only $O((\log \epsilon^{-1})^{d})$ weights, using a GNN trained on $O((\log \epsilon^{-1})^{d})$ samples, to $\epsilon$-approximate a discretized bandlimited signal in $\mathbb{R}^d$. The result is obtained by drawing a connection between the GNN structure and the classical sampling theorems, making our work the first attempt in this direction.