In this work, we identify community bias amplification in graph representation learning: the exacerbation of performance bias between different classes by graph representation learning. We study this phenomenon theoretically from a novel spectral perspective. Our analysis suggests that structural bias between communities causes node embeddings to converge at different local speeds, which in turn amplifies bias in the classification results of downstream tasks. Based on these theoretical insights, we propose random graph coarsening and prove that it is effective in dealing with this issue. Finally, we propose a novel graph contrastive learning model, Random Graph Coarsening Contrastive Learning (RGCCL), which uses random coarsening as data augmentation and mitigates community bias by contrasting the coarsened graph with the original graph. Extensive experiments on various datasets demonstrate the advantage of our method in dealing with community bias amplification.
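To make the idea of random coarsening as a data augmentation more concrete, the following is a minimal sketch under stated assumptions, not the RGCCL procedure itself. The abstract does not specify how the coarsening is performed, so this example assumes random edge contraction on a dense, symmetric adjacency matrix. The function name `random_coarsen`, the `ratio` parameter, and the toy graph are all hypothetical.

```python
import numpy as np


def random_coarsen(adj, ratio=0.5, seed=0):
    """Contract randomly chosen edges until about ratio * n supernodes remain.

    Assumption: this is one plausible way to coarsen a graph at random;
    it is not taken from the paper. Returns the coarsened adjacency and
    the n x m assignment matrix P mapping nodes to supernodes.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    target = max(1, int(np.ceil(ratio * n)))
    parent = np.arange(n)  # union-find forest over the original nodes

    def find(u):
        # Follow parent pointers to the root, halving paths on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Each undirected edge once (upper triangle), visited in random order.
    rows, cols = np.nonzero(np.triu(adj, k=1))
    clusters = n
    for idx in rng.permutation(len(rows)):
        if clusters <= target:
            break
        ru, rv = find(rows[idx]), find(cols[idx])
        if ru != rv:  # merge two distinct supernodes
            parent[ru] = rv
            clusters -= 1

    # Relabel the roots as 0..m-1 and build the assignment matrix.
    roots = np.array([find(u) for u in range(n)])
    _, labels = np.unique(roots, return_inverse=True)
    m = labels.max() + 1
    P = np.zeros((n, m))
    P[np.arange(n), labels] = 1.0

    # Coarsened adjacency: edge weights summed between supernodes.
    coarse = P.T @ adj @ P
    np.fill_diagonal(coarse, 0.0)  # drop self-loops from contracted edges
    return coarse, P


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0

coarse, P = random_coarsen(adj, ratio=0.5, seed=0)
print(coarse.shape)  # one row and column per supernode
```

In a contrastive setup of the kind the abstract describes, a coarsened graph drawn this way with a fresh seed at each step would serve as the augmented view contrasted against the original graph. How the two views are encoded and compared is not stated in the abstract and is left out of this sketch.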