In the domain of recommendation and collaborative filtering, Graph Contrastive Learning (GCL) has become an influential approach. Nevertheless, the reasons for the effectiveness of contrastive learning are still not well understood. In this paper, we challenge the conventional use of random augmentations on graph structure or embedding space in GCL, which may disrupt the structural and semantic information inherent in Graph Neural Networks. Moreover, fixed-rate data augmentation is less effective than augmentation with an adaptive rate: in the early training phases, strong perturbations are more suitable, while as training approaches convergence, milder perturbations yield better results. We introduce a twin encoder in place of random augmentations, demonstrating the redundancy of traditional augmentation techniques. The twin encoder's updating mechanism generates more diverse contrastive views in the early stages and transitions to views with greater similarity as training progresses. In addition, we investigate the learned representations from the perspective of alignment and uniformity on a hypersphere to optimize them more efficiently. Our proposed Twin Graph Contrastive Learning model -- TwinCL -- aligns positive pairs of user and item embeddings and the representations from the twin encoder, while maintaining the uniformity of the embeddings on the hypersphere. Our theoretical analysis and experimental results show that optimizing alignment and uniformity with the twin encoder yields better recommendation accuracy and training efficiency. In comprehensive experiments on three public datasets, TwinCL achieves an average improvement of 5.6% (NDCG@10) in recommendation accuracy with faster training, while effectively mitigating popularity bias.
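To make the two key ingredients concrete, the sketch below implements (a) the standard alignment and uniformity objectives on the hypersphere (mean squared distance between positive pairs, and the log of the mean Gaussian potential over all pairs), and (b) an exponential-moving-average update as one plausible reading of the twin encoder's updating mechanism, where a small momentum early in training lets the twin drift from the online encoder (diverse views) and a large momentum near convergence keeps the two close (similar views). This is a minimal illustration, not the paper's implementation: the function names, the temperature `t`, and the EMA interpretation of the twin update are assumptions.

```python
import math

def alignment(xs, ys):
    # Alignment: mean squared L2 distance between positive pairs of
    # unit-norm embeddings; lower is better (positives map close together).
    return sum(sum((a - b) ** 2 for a, b in zip(x, y))
               for x, y in zip(xs, ys)) / len(xs)

def uniformity(xs, t=2.0):
    # Uniformity: log of the mean Gaussian potential exp(-t * ||x_i - x_j||^2)
    # over all distinct pairs; lower means embeddings spread more evenly
    # over the hypersphere.
    pairs = [(xs[i], xs[j])
             for i in range(len(xs)) for j in range(i + 1, len(xs))]
    vals = [math.exp(-t * sum((a - b) ** 2 for a, b in zip(x, y)))
            for x, y in pairs]
    return math.log(sum(vals) / len(vals))

def ema_update(online_params, twin_params, momentum):
    # Hypothetical twin-encoder update: twin <- m * twin + (1 - m) * online.
    # A momentum schedule that grows toward 1.0 over training would yield
    # diverse contrastive views early and near-identical views late.
    return [momentum * t + (1.0 - momentum) * o
            for o, t in zip(online_params, twin_params)]
```

For example, identical positive pairs give an alignment of exactly 0, and uniformity is always non-positive since each Gaussian potential term is at most 1.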