GCL-OT: Graph Contrastive Learning with Optimal Transport for Heterophilic Text-Attributed Graphs

Recently, structure-text contrastive learning has shown promising performance on text-attributed graphs by leveraging the complementary strengths of graph neural networks and language models. However, existing methods typically rely on homophily assumptions in similarity estimation and on hard optimization objectives, which limits their applicability to heterophilic graphs. Although some approaches mitigate heterophily through structural adjustments or neighbor aggregation, they usually treat textual embeddings as static targets, leading to suboptimal alignment. In this work, we identify multi-granular heterophily in text-attributed graphs, comprising complete heterophily, partial heterophily, and latent homophily. The resulting mixed, noisy, and missing semantic correlations make structure-text alignment particularly challenging. To achieve flexible and bidirectional alignment, we propose GCL-OT, a novel graph contrastive learning framework with optimal transport (OT), equipped with a tailored mechanism for each type of heterophily. Specifically, for partial heterophily, we design a RealSoftMax-based similarity estimator that emphasizes key neighbor-word interactions while attenuating background noise. For complete heterophily, we introduce a prompt-based filter that adaptively excludes irrelevant noise during optimal transport alignment. Furthermore, we incorporate OT-guided soft supervision to uncover potential neighbors with similar semantics, enhancing the learning of latent homophily. Theoretical analysis shows that GCL-OT can improve the mutual information bound and Bayes error guarantees. Extensive experiments on nine benchmarks show that GCL-OT outperforms state-of-the-art methods, demonstrating its effectiveness and robustness.

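To make two of the building blocks named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that RealSoftMax denotes the temperature-scaled log-sum-exp (a smooth maximum) and that the optimal transport alignment can be approximated with entropy-regularized Sinkhorn iterations. The function names, the cosine-based cost, the uniform marginals, and the values of `beta` and `eps` are all illustrative assumptions.

```python
import math

def real_softmax(values, beta=1.0):
    """Smooth maximum: (1/beta) * log(sum(exp(beta * v))), computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def pooled_similarity(neighbor_vecs, word_vecs, beta=5.0):
    """Hypothetical helper: pool all neighbor-word similarities with RealSoftMax,
    so that the strongest interactions dominate the aggregated score."""
    sims = [cosine(n, w) for n in neighbor_vecs for w in word_vecs]
    return real_softmax(sims, beta)

def sinkhorn(cost, a, b, eps=0.5, n_iters=500):
    """Entropy-regularized OT plan between marginals a and b (Sinkhorn iterations)."""
    rows, cols = len(a), len(b)
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(n_iters):
        # Alternately rescale columns and rows to match the target marginals.
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(cols)] for i in range(rows)]

# Toy data (made up for illustration): three structure embeddings, two text embeddings.
graph_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_emb = [[1.0, 0.1], [0.1, 1.0]]

# Cost = 1 - cosine similarity; uniform marginals on both sides.
cost = [[1.0 - cosine(g, t) for t in text_emb] for g in graph_emb]
plan = sinkhorn(cost, a=[1 / 3] * 3, b=[0.5, 0.5])
score = pooled_similarity(graph_emb, text_emb)
```

In this sketch, a larger `beta` pushes `real_softmax` toward the hard maximum, while a smaller `beta` spreads weight over more neighbor-word pairs. The rows of `plan` can be read as soft assignments from structure embeddings to text embeddings, which is the kind of signal that OT-guided soft supervision could draw on.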