For graph self-supervised learning (GSSL), the masked autoencoder (MAE) follows the generative paradigm and learns to reconstruct masked graph edges or node features. Contrastive learning (CL) maximizes the similarity between augmented views of the same graph and is widely used for GSSL. However, existing GSSL works treat MAE and CL separately. We observe that the two paradigms are complementary and propose the graph contrastive masked autoencoder (GCMAE) framework to unify them. Specifically, because it focuses on local edges or node features, MAE cannot capture global information about the graph and is sensitive to particular edges and features. In contrast, CL excels at extracting global information because it considers the relation between graphs. We therefore equip GCMAE with an MAE branch and a CL branch that share a common encoder, which allows the MAE branch to exploit the global information extracted by the CL branch. To force GCMAE to capture global graph structures, we train it to reconstruct the entire adjacency matrix instead of only the masked edges, as existing works do. Moreover, to tackle the feature smoothing problem of MAE, we propose a discrimination loss for feature reconstruction that increases the disparity between node embeddings rather than reducing the reconstruction error. We evaluate GCMAE on four popular graph tasks (node classification, node clustering, link prediction, and graph classification) and compare it with 14 state-of-the-art baselines. The results show that GCMAE consistently achieves good accuracy across these tasks, improving accuracy by up to 3.2% over the best-performing baseline.
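The two-branch design described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the one-layer GCN-style encoder, the linear decoder, the edge-dropping augmentation, the node-level InfoNCE objective, the variance-based form of the discrimination loss, and all function names and loss weights are assumptions chosen for illustration. What it does take from the abstract is the structure: one shared encoder feeds both branches, the adjacency loss is computed over the entire adjacency matrix, and a separate term rewards dissimilar node embeddings.

```python
import numpy as np


def normalize_adj(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def encode(adj, x, w_enc):
    """Shared encoder (assumed: a single GCN-style layer with tanh)."""
    return np.tanh(normalize_adj(adj) @ x @ w_enc)


def mask_features(x, rate, rng):
    """Zero the features of a random subset of nodes; return the view and the mask."""
    mask = rng.random(x.shape[0]) < rate
    x_masked = x.copy()
    x_masked[mask] = 0.0
    return x_masked, mask


def feature_loss(x, x_rec, mask):
    """Mean squared reconstruction error on the masked nodes only."""
    if not mask.any():
        return 0.0
    return float(np.mean((x[mask] - x_rec[mask]) ** 2))


def adjacency_loss(adj, z):
    """Binary cross-entropy between sigmoid(z z^T) and the ENTIRE adjacency matrix."""
    p = np.clip(1.0 / (1.0 + np.exp(-(z @ z.T))), 1e-9, 1.0 - 1e-9)
    return float(-np.mean(adj * np.log(p) + (1.0 - adj) * np.log(1.0 - p)))


def contrastive_loss(z1, z2, tau=0.5):
    """InfoNCE (assumed): node i in view 1 is the positive of node i in view 2."""
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + 1e-12)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + 1e-12)
    logits = z1 @ z2.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def discrimination_loss(z):
    """Assumed form: negative mean per-dimension variance of the embeddings.

    Minimizing it pushes node embeddings apart, which counteracts smoothing.
    """
    return float(-np.mean(np.var(z, axis=0)))


def gcmae_style_loss(adj, x, w_enc, w_dec, rng, mask_rate=0.5, drop_rate=0.2):
    """Combine the four terms with equal (hypothetical) weights."""
    # MAE branch: encode the feature-masked graph with the shared encoder.
    x_masked, mask = mask_features(x, mask_rate, rng)
    z_mae = encode(adj, x_masked, w_enc)

    # CL branch: encode an edge-dropped view with the SAME encoder weights.
    keep = np.triu(rng.random(adj.shape) > drop_rate, k=1)
    z_cl = encode(adj * (keep + keep.T), x, w_enc)

    losses = {
        "feature": feature_loss(x, z_mae @ w_dec, mask),
        "adjacency": adjacency_loss(adj, z_mae),
        "contrastive": contrastive_loss(z_mae, z_cl),
        "discrimination": discrimination_loss(z_mae),
    }
    losses["total"] = sum(losses.values())
    return losses


# Toy usage: a 6-node ring graph with random 4-dimensional features.
rng = np.random.default_rng(0)
n, d_in, d_hid = 6, 4, 3
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
x = rng.normal(size=(n, d_in))
w_enc = rng.normal(size=(d_in, d_hid))
w_dec = rng.normal(size=(d_hid, d_in))
losses = gcmae_style_loss(adj, x, w_enc, w_dec, rng)
print(losses)
```

Because both branches call `encode` with the same `w_enc`, a gradient step on the contrastive term would also change the embeddings that the reconstruction terms see, which is the mechanism by which the MAE branch can benefit from the global information learned by the CL branch. The sketch only evaluates the losses; training would require an autodiff framework.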