Graph contrastive learning (GCL) has become a powerful tool for learning from graph data, but its scalability remains a significant challenge. In this work, we propose a simple yet effective training framework called Structural Compression (StructComp) to address this issue. Inspired by a sparse low-rank approximation of the diffusion matrix, StructComp trains the encoder on compressed nodes. As a result, the encoder does not need to perform any message passing during training, and the number of sample pairs in the contrastive loss is significantly reduced. We theoretically prove that the original GCL loss can be approximated by the contrastive loss computed by StructComp. Moreover, StructComp can be regarded as an additional regularization term for GCL models, resulting in a more robust encoder. Empirical studies on seven benchmark datasets show that StructComp greatly reduces time and memory consumption while improving model performance compared to vanilla GCL models and scalable training methods.
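To make the idea of training on compressed nodes concrete, the following is a minimal sketch and not the authors' implementation. It assumes that compression means mean-pooling node features within the clusters of a node partition. The abstract does not specify the compression rule, so the function `compress_features`, the round-robin `assignment`, and the sizes `n`, `d`, `k` are all hypothetical. The sketch only shows how the number of candidate sample pairs in a contrastive loss shrinks from `n * n` to `k * k` when the loss is computed over `k` compressed nodes.

```python
import numpy as np


def compress_features(X, assignment, num_clusters):
    """Mean-pool node features within each cluster.

    This pooling rule is an assumption made for illustration; the
    abstract only states that the encoder is trained on compressed nodes.
    """
    n = X.shape[0]
    # One-hot partition matrix P of shape (n, num_clusters).
    P = np.zeros((n, num_clusters))
    P[np.arange(n), assignment] = 1.0
    sizes = P.sum(axis=0)
    sizes[sizes == 0] = 1.0  # guard against empty clusters
    return (P.T @ X) / sizes[:, None]


rng = np.random.default_rng(0)
n, d, k = 1000, 16, 50            # hypothetical sizes
X = rng.normal(size=(n, d))       # stand-in node features

# Hypothetical partition: a real one would come from a graph partitioner.
assignment = np.arange(n) % k

X_c = compress_features(X, assignment, k)

# Candidate pairs an all-pairs contrastive loss would consider.
pairs_full = n * n
pairs_compressed = k * k
```

Under these assumptions, an encoder that performs no message passing could then be trained on the `k` rows of `X_c` instead of the `n` rows of `X`.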