We study the node classification problem on feature-decorated graphs in the sparse setting, i.e., when the expected degree of a node is $O(1)$ in the number of nodes. Such graphs are typically known to be locally tree-like. We introduce a notion of Bayes optimality for node classification tasks, called asymptotic local Bayes optimality, and compute the optimal classifier according to this criterion for a fairly general statistical data model with arbitrary distributions of the node features and edge connectivity. The optimal classifier is implementable using a message-passing graph neural network architecture. We then compute the generalization error of this classifier and compare its performance against existing learning methods theoretically on a well-studied statistical model with naturally identifiable signal-to-noise ratios (SNRs) in the data. We find that the optimal message-passing architecture interpolates between a standard MLP in the regime of low graph signal and a typical convolution in the regime of high graph signal. Furthermore, we prove a corresponding non-asymptotic result.