SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking

Generative graph self-supervised learning (SSL) aims to learn node representations by reconstructing the input graph data. However, most existing methods focus only on unsupervised learning tasks, and very few works have demonstrated superiority over state-of-the-art graph contrastive learning (GCL) models, especially on classification. Although a very recent model has been proposed to bridge this gap, its performance on unsupervised learning tasks remains unknown. In this paper, to comprehensively improve the performance of generative graph SSL relative to GCL models on both unsupervised and supervised learning tasks, we propose SeeGera, a model built on the family of self-supervised variational graph auto-encoders (VGAE). Specifically, SeeGera adopts semi-implicit variational inference, a hierarchical variational framework, and focuses mainly on feature reconstruction and structure/feature masking. On the one hand, SeeGera co-embeds both nodes and features in the encoder and reconstructs both links and features in the decoder. Because feature embeddings carry rich semantic information about the features, they can be combined with node embeddings to provide fine-grained knowledge for feature reconstruction. On the other hand, SeeGera adds a layer for structure/feature masking to the hierarchical variational framework, which improves the model's generalizability. We conduct extensive experiments comparing SeeGera with 9 other state-of-the-art competitors. Our results show that SeeGera compares favorably against state-of-the-art GCL methods across a variety of unsupervised and supervised learning tasks.
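
To make the data flow described above concrete, the following is a minimal sketch, not the authors' implementation. It masks structure and features, produces one embedding per node and one per feature, and then reconstructs both links and features. Everything specific here is an assumption made for illustration: the helper names (`mask_entries`, `reconstruct`), the masking rate, the inner-product decoders, and above all the "encoder", which is a fixed random linear map standing in for the semi-implicit hierarchical posterior that the abstract mentions but does not specify.

```python
import math
import random


def mask_entries(matrix, rate, rng):
    """Return a copy of `matrix` with each entry zeroed with probability `rate`."""
    return [[0.0 if rng.random() < rate else float(x) for x in row] for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul_transposed(a, b):
    """Compute a @ b.T for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def reconstruct(node_emb, feat_emb):
    """Inner-product decoders (an assumption; the abstract does not name them)."""
    link_probs = [[sigmoid(v) for v in row]
                  for row in matmul_transposed(node_emb, node_emb)]
    feat_recon = matmul_transposed(node_emb, feat_emb)  # shape: nodes x features
    return link_probs, feat_recon


rng = random.Random(0)
num_nodes, num_feats, dim = 4, 3, 2

# Toy path graph 0-1-2-3 and random node features.
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
features = [[rng.random() for _ in range(num_feats)] for _ in range(num_nodes)]

# Structure/feature masking. Entries are masked independently, so the masked
# adjacency is not guaranteed to stay symmetric in this toy version.
masked_adj = mask_entries(adjacency, 0.3, rng)
masked_feats = mask_entries(features, 0.3, rng)

# Stand-in encoder (hypothetical): one round of neighbor aggregation with
# self-loops, followed by random linear projections.
with_loops = [[v + (1.0 if i == j else 0.0) for j, v in enumerate(row)]
              for i, row in enumerate(masked_adj)]
aggregated = matmul_transposed(with_loops, transpose(masked_feats))
w_node = [[rng.gauss(0, 1) for _ in range(num_feats)] for _ in range(dim)]
w_feat = [[rng.gauss(0, 1) for _ in range(num_nodes)] for _ in range(dim)]
node_emb = matmul_transposed(aggregated, w_node)               # nodes x dim
feat_emb = matmul_transposed(transpose(masked_feats), w_feat)  # features x dim

link_probs, feat_recon = reconstruct(node_emb, feat_emb)
```

In this sketch each reconstructed feature value is the inner product of a node embedding and a feature embedding, which is one simple way the two kinds of embeddings could be combined for feature reconstruction. A training loop would compare `link_probs` and `feat_recon` against the unmasked `adjacency` and `features`.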
