Graph neural networks (GNNs), with their ability to integrate graph information, have been widely used for data analysis. However, the expressive power of GNNs has been studied only for graph-level tasks and not for node-level tasks such as node classification, where one tries to interpolate missing nodal labels from the observed ones. In this paper, we study the expressive power of GNNs for this classification task, which is in essence a function interpolation problem. Explicitly, we derive the number of weights and layers needed for a GNN to interpolate a bandlimited function in $\mathbb{R}^d$. Our result shows that the number of weights needed to $\epsilon$-approximate a bandlimited function using the GNN architecture is far smaller than the best known bound for a fully connected neural network (NN). In particular, one needs only $O((\log \epsilon^{-1})^{d})$ weights, using a GNN trained on $O((\log \epsilon^{-1})^{d})$ samples, to $\epsilon$-approximate a discretized bandlimited signal in $\mathbb{R}^d$. The result is obtained by drawing a connection between the GNN structure and the classical sampling theorems, making our work the first attempt in this direction.