This paper studies structured node classification on graphs, where predictions should account for dependencies between node labels. In particular, we focus on partially labeled graphs, where it is essential to incorporate the information in the known labels when predicting the unknown labels. To address this problem, we propose a novel framework that leverages diffusion probabilistic models for structured node classification (DPM-SNC). At the heart of our framework is the capability of DPM-SNC to (a) learn a joint distribution over the labels with an expressive reverse diffusion process and (b) make predictions conditioned on the known labels using manifold-constrained sampling. Since DPMs lack training algorithms for partially labeled data, we design a novel training algorithm that maximizes a new variational lower bound. We also theoretically analyze how DPMs benefit node classification by enhancing the expressive power of GNNs; to this end, we propose AGG-WL, which is strictly more powerful than the classic 1-WL test. We extensively verify the superiority of DPM-SNC in diverse scenarios, which include not only the transductive setting on partially labeled graphs but also the inductive setting and unlabeled graphs.
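For readers unfamiliar with the classic 1-WL test named in the abstract, the following is a minimal sketch of 1-WL color refinement. It illustrates only the baseline test; the abstract does not define AGG-WL, so it is not reproduced here. The function names (`wl_histogram`, `cycle`), the adjacency-dictionary graph format, and the example graphs are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def wl_histogram(adj, init, rounds=3):
    """Run 1-WL color refinement and return the multiset of final colors.

    adj:  dict mapping each node to a list of its neighbors (illustrative format)
    init: dict mapping each node to an initial color, e.g. a node label
    """
    colors = dict(init)
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        # Colors are kept as nested tuples so they are comparable across graphs.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

def cycle(nodes):
    """Adjacency dict of a simple cycle over the given nodes (hypothetical helper)."""
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}

# Two non-isomorphic 2-regular graphs on six nodes.
hexagon = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
uniform = {v: 0 for v in range(6)}

# A path on three nodes, where the endpoints differ from the middle node.
path = {0: [1], 1: [0, 2], 2: [1]}

same = wl_histogram(hexagon, uniform) == wl_histogram(two_triangles, uniform)
path_colors = len(wl_histogram(path, {0: 0, 1: 0, 2: 0}))
```

In this sketch, `same` is true: with uniform initial colors, 1-WL cannot tell a six-node cycle from two disjoint triangles, because every node sees the same neighborhood multiset in every round. This is the kind of limitation on expressive power that tests stronger than 1-WL are meant to overcome.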