Graph Contrastive Learning (GCL) has emerged as a promising approach to graph self-supervised learning. Prevailing GCL methods largely inherit the principles of contrastive learning from computer vision: they model invariance by specifying absolutely similar pairs. When applied to graph data, however, this paradigm encounters two significant limitations: (1) the validity of the generated views cannot be guaranteed, since graph perturbation may produce views that violate the semantics and intrinsic topology of the graph; (2) specifying absolutely similar pairs among graph views is unreliable, since for abstract, non-Euclidean graph data it is difficult for humans to judge absolute similarity and dissimilarity intuitively. Despite the notable performance of current GCL methods, these challenges call for a reevaluation: could GCL be tailored more effectively to the intrinsic properties of graphs, rather than merely adopting principles from computer vision? In response, we propose a novel paradigm, Graph Soft-Contrastive Learning (GSCL). GSCL performs contrastive learning via neighborhood ranking, which avoids the need to specify absolutely similar pairs. It leverages an underlying characteristic of graphs, diminishing label consistency: nodes that are closer in the graph are, overall, more similar than distant nodes. Within the GSCL framework, we introduce pairwise and listwise gated ranking InfoNCE loss functions that preserve the relative similarity ranking within neighborhoods. Moreover, because the neighborhood size grows exponentially as more hops are considered, we propose neighborhood sampling strategies to improve learning efficiency. Our extensive empirical results on 11 commonly used graph datasets (8 homophily graphs and 3 heterophily graphs) demonstrate that GSCL achieves superior performance compared with 20 state-of-the-art GCL methods.
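To make the idea of neighborhood ranking concrete, the following is a minimal sketch of a ranking-style InfoNCE loss for a single anchor node. It is not the gated pairwise or listwise loss proposed in the paper, whose exact form the abstract does not give. It is an ungated illustration under stated assumptions: the function names (`hop_distances`, `ranking_infonce`), the cosine similarity, the temperature `tau`, and the rule "nodes within k hops are positives, farther nodes are negatives" are all hypothetical choices made for this example.

```python
import math
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search: {node: hop distance} for nodes within max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop budget
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ranking_infonce(emb, adj, anchor, max_hops=2, tau=0.5):
    """Illustrative, ungated ranking loss for one anchor (an assumption,
    not the paper's loss).

    For each hop threshold k in 1..max_hops-1, nodes within k hops act as
    positives and nodes between k+1 and max_hops hops act as negatives.
    Minimizing the averaged InfoNCE terms pushes nearer neighbors to be
    more similar to the anchor than farther ones.
    """
    dist = hop_distances(adj, anchor, max_hops)
    # exp(similarity / tau) for every reachable node except the anchor
    score = {
        v: math.exp(cosine(emb[anchor], emb[v]) / tau)
        for v in dist
        if v != anchor
    }
    loss, terms = 0.0, 0
    for k in range(1, max_hops):
        pos = sum(s for v, s in score.items() if dist[v] <= k)
        neg = sum(s for v, s in score.items() if dist[v] > k)
        if pos == 0.0 or neg == 0.0:
            continue  # this threshold has no positives or no negatives
        loss += -math.log(pos / (pos + neg))
        terms += 1
    return loss / terms if terms else 0.0
```

On a three-node path graph `0 - 1 - 2` with anchor `0`, embeddings in which node `1` resembles the anchor more than node `2` does yield a lower loss than embeddings with the two swapped, which is the relative ordering that diminishing label consistency suggests. A practical version would also subsample each hop's neighborhood, since the abstract notes that neighborhood size grows exponentially with the number of hops.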