Graph Convolutional Networks (GCNs) have demonstrated promising results for recommender systems, as they can effectively leverage high-order relationships. However, these methods usually suffer from data sparsity in real-world scenarios. To address this issue, GCN-based recommendation methods employ contrastive learning to introduce self-supervised signals. Despite their effectiveness, these methods do not account for the significant degree disparity between head and tail nodes. This can lead to a non-uniform representation distribution, whereas uniformity is a crucial factor in the performance of contrastive learning methods. To tackle this issue, we propose a novel Long-tail Augmented Graph Contrastive Learning (LAGCL) method for recommendation. Specifically, we introduce a learnable long-tail augmentation approach that enhances tail nodes by supplementing predicted neighbor information, and we generate contrastive views based on the resulting augmented graph. To make the data augmentation scheme learnable, we design an auto drop module to generate pseudo-tail nodes from head nodes and a knowledge transfer module to reconstruct the head nodes from pseudo-tail nodes. Additionally, we employ generative adversarial networks to ensure that the distribution of the generated tail/head nodes matches that of the original tail/head nodes. Extensive experiments conducted on three benchmark datasets demonstrate that our model significantly outperforms state-of-the-art methods. Further analyses demonstrate the uniformity of the learned representations and the superiority of LAGCL in long-tail performance. Code is publicly available at https://github.com/im0qianqian/LAGCL
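The head/tail distinction and the idea of deriving pseudo-tail nodes from head nodes can be illustrated with a small sketch. This is not the LAGCL implementation: the abstract describes the auto drop module as learnable, whereas the sketch below substitutes uniform random neighbor dropping as a stand-in. The degree threshold `k`, the function names `split_head_tail` and `make_pseudo_tail`, and the toy graph are all assumptions made for illustration.

```python
import random


def split_head_tail(adj, k):
    """Split nodes by degree (assumed rule): more than k neighbors is head, otherwise tail."""
    head = [n for n, nbrs in adj.items() if len(nbrs) > k]
    tail = [n for n, nbrs in adj.items() if len(nbrs) <= k]
    return head, tail


def make_pseudo_tail(adj, head, k, rng):
    """Turn each head node into a pseudo-tail node by keeping 1..k random neighbors.

    Uniform random sampling is a stand-in here; the paper's auto drop
    module is learnable and is not reproduced by this function.
    """
    pseudo = {}
    for n in head:
        nbrs = sorted(adj[n])          # sort for reproducible sampling
        keep = rng.randint(1, k)       # pseudo-tail degree in [1, k]
        pseudo[n] = set(rng.sample(nbrs, keep))
    return pseudo


# Hypothetical toy interaction graph: user -> set of interacted items.
adj = {
    "u1": {"i1", "i2", "i3", "i4", "i5"},
    "u2": {"i1"},
    "u3": {"i2", "i3", "i4", "i6"},
    "u4": {"i5", "i6"},
}

rng = random.Random(0)
head, tail = split_head_tail(adj, k=2)
pseudo_tail = make_pseudo_tail(adj, head, k=2, rng=rng)
```

In this sketch, each pseudo-tail node keeps its full original neighborhood in `adj` as a known target, which is what would let a knowledge transfer step be trained to predict the dropped neighbors before being applied to real tail nodes.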