Representation learning on text-attributed graphs (TAGs) is vital for real-world applications, as TAGs combine semantic textual information with contextual structural information. Research in this field generally consists of two main perspectives: local-level encoding and global-level aggregating, which respectively refer to textual node information unification (e.g., using Language Models) and structure-augmented modeling (e.g., using Graph Neural Networks). Most existing works focus on combining different information levels but overlook their interconnections, i.e., the contextual textual information among nodes, which provides semantic insights that bridge the local and global levels. In this paper, we propose GraphBridge, a multi-granularity integration framework that bridges the local and global perspectives by leveraging contextual textual information, enhancing fine-grained understanding of TAGs. Furthermore, to tackle scalability and efficiency challenges, we introduce a graph-aware token reduction module. Extensive experiments across various models and datasets show that our method achieves state-of-the-art performance, while our graph-aware token reduction module significantly enhances efficiency and resolves scalability issues.