Graph neural networks have recently advanced the state of the art in graph classification. These methods are typically studied in a supervised end-to-end training setting, which requires abundant task-specific labels. In real-world scenarios, however, labeled data may be scarce while a large corpus of unlabeled data is available, some of it from unknown classes. To this end, we study the problem of semi-supervised universal graph classification, which requires both identifying graph samples that do not belong to any known class and classifying the remaining samples into their respective classes. This problem is challenging because of the severe lack of labels and potential class shifts. In this paper, we propose a novel graph neural network framework named UGNN, which makes the most of unlabeled data from a subgraph perspective. To tackle class shifts, we estimate the certainty of unlabeled graphs using multiple subgraphs, which facilitates the discovery of unlabeled data from unknown categories. Moreover, we construct semantic prototypes in the embedding space for both known and unknown categories and use posterior prototype assignments inferred by the Sinkhorn-Knopp algorithm to learn from abundant unlabeled graphs across different subgraph views. Extensive experiments on six datasets verify the effectiveness of UGNN in different settings.
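The prototype-assignment step mentioned above can be illustrated with a minimal, dependency-free sketch of Sinkhorn-Knopp normalization. It is not the UGNN implementation: the abstract does not specify the temperature, the number of iterations, or the exact constraints, so the function name `sinkhorn_knopp`, the parameters `epsilon` and `n_iters`, and the equal-usage constraint on prototypes are all assumptions made for illustration. The input is a matrix of graph-to-prototype similarity scores, and the output is a soft assignment in which each graph's weights sum to 1 and every prototype receives roughly the same total weight.

```python
import math


def sinkhorn_knopp(scores, epsilon=0.5, n_iters=50):
    """Turn similarity scores into balanced soft prototype assignments.

    scores:  list of n rows (graphs), each with k similarity values (prototypes).
    epsilon: temperature (assumed value); smaller gives sharper assignments.
    n_iters: number of alternating normalization rounds (assumed value).
    """
    n, k = len(scores), len(scores[0])
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in scores)
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]

    for _ in range(n_iters):
        # Column step: rescale so every prototype receives total mass n / k.
        col_sums = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] * (n / k) / col_sums[j] for j in range(k)] for i in range(n)]
        # Row step: rescale so each graph's assignment sums to 1.
        for i, row in enumerate(q):
            total = sum(row)
            q[i] = [v / total for v in row]
    return q


if __name__ == "__main__":
    # Hypothetical similarities between 6 graph embeddings and 3 prototypes.
    demo_scores = [
        [0.9, 0.1, 0.2],
        [0.8, 0.3, 0.1],
        [0.2, 0.7, 0.1],
        [0.1, 0.9, 0.3],
        [0.2, 0.1, 0.6],
        [0.3, 0.2, 0.8],
    ]
    for row in sinkhorn_knopp(demo_scores):
        print([round(v, 3) for v in row])
```

The row step runs last, so each row of the result is a proper distribution over prototypes, while the column sums are only approximately equal and approach n / k as the iterations converge. Whether UGNN enforces this particular balance constraint is not stated in the abstract.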