Graph contrastive learning (GCL) has become a powerful tool for learning from graph data, but its scalability remains a significant challenge. In this work, we propose a simple yet effective training framework called Structural Compression (StructComp) to address this issue. Inspired by a sparse low-rank approximation of the diffusion matrix, StructComp trains the encoder on compressed nodes. As a result, the encoder performs no message passing during training, and the number of sample pairs in the contrastive loss is significantly reduced. We theoretically prove that the original GCL loss can be approximated by the contrastive loss computed by StructComp. Moreover, StructComp can be regarded as an additional regularization term for GCL models, resulting in a more robust encoder. Empirical studies on various datasets show that StructComp greatly reduces time and memory consumption while improving model performance compared to vanilla GCL models and scalable training methods.
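The idea of training on compressed nodes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that compression means averaging node features within the clusters of a graph partition, and the names `compress_features` and `mlp_encoder`, the modulo-based cluster assignment, and all sizes are hypothetical choices made for illustration only.

```python
import numpy as np

def compress_features(X, assignment, num_clusters):
    """Average node features within each cluster (assumed compression step)."""
    n = X.shape[0]
    # One-hot partition matrix P of shape (n, num_clusters).
    P = np.zeros((n, num_clusters))
    P[np.arange(n), assignment] = 1.0
    sizes = P.sum(axis=0)
    sizes[sizes == 0] = 1.0  # guard against empty clusters
    return (P.T @ X) / sizes[:, None]

def mlp_encoder(X, W1, W2):
    """Two-layer MLP; it never touches an adjacency matrix, so no message passing."""
    return np.maximum(X @ W1, 0.0) @ W2

rng = np.random.default_rng(0)
n, d, k, h = 1000, 16, 50, 8          # nodes, feature dim, clusters, hidden dim
X = rng.normal(size=(n, d))
assignment = np.arange(n) % k          # stand-in for a real graph partition

X_c = compress_features(X, assignment, k)   # (k, d) compressed node features
W1 = rng.normal(size=(d, h))
W2 = rng.normal(size=(h, h))
Z_c = mlp_encoder(X_c, W1, W2)              # (k, h) embeddings used in the loss

# A contrastive loss over all node pairs scales with n^2; over compressed nodes, with k^2.
pairs_full = n * n
pairs_compressed = k * k
```

In this sketch the encoder sees only `k` rows instead of `n`, which is where both the memory saving and the reduction in contrastive sample pairs would come from under the stated assumptions.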