This paper studies semi-supervised graph classification, a crucial task with a wide range of applications in social network analysis and bioinformatics. Recent works typically adopt graph neural networks to learn graph-level representations for classification, but they fail to explicitly leverage features derived from graph topology (e.g., paths). Moreover, when labeled data is scarce, these methods are far from satisfactory because they do not sufficiently explore the topology of unlabeled data. We address this challenge by proposing a novel semi-supervised framework called Twin Graph Neural Network (TGNN). To explore graph structural information from complementary views, TGNN has a message passing module and a graph kernel module. To fully utilize unlabeled data, for each module we calculate the similarity of each unlabeled graph to the labeled graphs in a memory bank, and a consistency loss encourages agreement between the two resulting similarity distributions, which live in different embedding spaces. The twin modules collaborate by exchanging instance similarity knowledge, so that the structural information of both labeled and unlabeled data is fully explored. We evaluate TGNN on various public datasets and show that it achieves strong performance.
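The consistency idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the normalization, or the divergence, so cosine similarity, a temperature-scaled softmax (`tau`), and a symmetric KL divergence are all assumptions made here for illustration. The function names and toy embeddings are hypothetical as well. The two memory banks are assumed to hold the same labeled graphs in the same order, embedded by the message passing module and the graph kernel module respectively, so their embedding dimensions may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def similarity_distribution(query, bank, tau=0.5):
    """Softmax over the similarities of one unlabeled graph to every
    labeled graph in the memory bank (tau is an assumed temperature)."""
    logits = [cosine(query, anchor) / tau for anchor in bank]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(z_mp, z_gk, bank_mp, bank_gk, tau=0.5):
    """Symmetric KL between the similarity distributions computed in the
    message passing space (mp) and the graph kernel space (gk)."""
    p = similarity_distribution(z_mp, bank_mp, tau)
    q = similarity_distribution(z_gk, bank_gk, tau)
    return 0.5 * (kl(p, q) + kl(q, p))


# Toy memory banks: three labeled graphs, embedded in a 2-d and a 3-d space.
bank_mp = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bank_gk = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

# One unlabeled graph whose two embeddings agree about its neighbors ...
loss_agree = consistency_loss([0.9, 0.1], [0.9, 0.1, 0.0], bank_mp, bank_gk)
# ... and one whose two embeddings disagree.
loss_disagree = consistency_loss([0.9, 0.1], [0.1, 0.9, 0.0], bank_mp, bank_gk)
```

In this toy setup the loss is close to zero when both modules rank the labeled graphs the same way and grows when they disagree, which is the signal a consistency term of this kind would backpropagate into both modules.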