Graph contrastive learning (GCL) has emerged as a state-of-the-art strategy for learning representations of diverse graphs, including social and biomedical networks. GCL methods commonly rely on stochastic graph topology augmentation, such as uniform node dropping, to generate augmented graphs. However, such stochastic augmentations may severely damage the intrinsic properties of a graph and degrade the subsequent representation learning. We argue that incorporating an awareness of cohesive subgraphs into both graph augmentation and graph learning has the potential to enhance GCL performance. To this end, we propose CTAug, a novel unified framework that integrates cohesion awareness into various existing GCL mechanisms. CTAug comprises two specialized modules: topology augmentation enhancement and graph learning enhancement. The former generates augmented graphs that preserve cohesion properties, while the latter strengthens the graph encoder's ability to discern subgraph patterns. Theoretical analysis shows that CTAug can strictly improve existing GCL mechanisms. Empirical experiments verify that CTAug achieves state-of-the-art performance for graph representation learning, especially on high-degree graphs. The code is available at https://doi.org/10.5281/zenodo.10594093 and https://github.com/wuyucheng2002/CTAug.
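To make the contrast between uniform and cohesion-aware node dropping concrete, the following is a minimal sketch and not the CTAug implementation. It assumes k-core numbers as a stand-in for cohesion, and the linear down-weighting rule with its `decay` parameter is hypothetical; all function names are illustrative and do not come from the CTAug code base.

```python
import random


def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def core_numbers(adj):
    """Compute the k-core number of every node by repeatedly peeling a minimum-degree node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core number never decreases along the peeling order
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def uniform_probs(adj, p):
    """Baseline: every node is dropped with the same probability p."""
    return {v: p for v in adj}


def cohesion_aware_probs(adj, p, decay=0.5):
    """Hypothetical rule: lower the drop probability of nodes with a high core number."""
    core = core_numbers(adj)
    k_max = max(core.values(), default=0) or 1  # avoid division by zero on edgeless graphs
    return {v: p * (1.0 - decay * core[v] / k_max) for v in adj}


def drop_nodes(adj, drop_prob, rng):
    """Drop each node independently and return the subgraph induced by the kept nodes."""
    kept = {v for v in adj if rng.random() >= drop_prob[v]}
    return {v: adj[v] & kept for v in kept}


if __name__ == "__main__":
    # A 4-clique (nodes 0-3) with a two-node tail (3-4-5).
    graph = build_adj([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])
    rng = random.Random(0)
    print("core numbers:", core_numbers(graph))
    print("uniform view:", drop_nodes(graph, uniform_probs(graph, 0.2), rng))
    print("cohesion-aware view:", drop_nodes(graph, cohesion_aware_probs(graph, 0.2), rng))
```

Under this sketch, nodes in the clique are dropped less often than nodes on the tail, so the dense substructure is more likely to survive in the augmented view. Other cohesion measures, or a different weighting rule, could be substituted without changing the overall shape of the example.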