Graph contrastive learning has emerged as a powerful tool for unsupervised graph representation learning. Its success depends on acquiring high-quality positive and negative samples as contrasting pairs, from which the model learns the underlying structural semantics of the input graph. Recent works usually draw negative samples from the same training batch as the positive samples, or from an external, irrelevant graph. A significant limitation of such strategies is that they unavoidably sample false negatives. In this paper, we propose a novel method that uses a \textbf{C}ounterfactual mechanism to generate artificial hard negative samples for \textbf{G}raph \textbf{C}ontrastive learning, namely \textbf{CGC}, which takes a different perspective from these sampling-based strategies. The counterfactual mechanism ensures that the generated samples are similar to the positive sample but have labels that differ from it. The proposed method achieves satisfying results on several datasets compared to some traditional unsupervised graph learning methods and some state-of-the-art graph contrastive learning methods. We also conduct supplementary experiments to illustrate the proposed method more extensively, including the performance of CGC with different hard negative samples and evaluations of hard negative samples generated with different similarity measurements.