Text-attributed graphs (TAGs) present unique challenges for representation learning, requiring models to capture both the semantic richness of node-associated texts and the structural dependencies of the graph. While graph neural networks (GNNs) excel at modeling topological information, they lack the capacity to process unstructured text. Conversely, large language models (LLMs) are proficient at text understanding but are typically unaware of graph structure. In this work, we propose BiGTex (Bidirectional Graph Text), a novel architecture that tightly integrates GNNs and LLMs through stacked Graph-Text Fusion Units. Each unit allows mutual attention between textual and structural representations, enabling information to flow in both directions: text informing structure, and structure guiding textual interpretation. The proposed architecture is trained with parameter-efficient fine-tuning (LoRA), keeping the LLM frozen while adapting to task-specific signals. Extensive experiments on five benchmark datasets demonstrate that BiGTex achieves state-of-the-art performance in node classification and generalizes effectively to link prediction. An ablation study further highlights the importance of soft prompting and bidirectional attention in the model's success.
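The core idea of a Graph-Text Fusion Unit, mutual attention between textual and structural representations with information flowing in both directions, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function names (`fusion_unit`, `cross_attention`), the residual connections, and the omission of learned projection matrices and multi-head attention are all simplifying assumptions made here for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query row attends over
    the rows of keys_values (single head, no learned projections)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores) @ keys_values

def fusion_unit(text_emb, struct_emb):
    """One hypothetical Graph-Text Fusion Unit: each modality attends
    to the other, and a residual connection preserves the original signal.
    (Sketch only -- the real unit presumably uses learned weights.)"""
    text_out = text_emb + cross_attention(text_emb, struct_emb)      # structure guides text
    struct_out = struct_emb + cross_attention(struct_emb, text_emb)  # text informs structure
    return text_out, struct_out

# Toy example: 4 nodes, embedding dimension 8, three stacked units.
rng = np.random.default_rng(0)
text_emb = rng.standard_normal((4, 8))    # stand-in for LLM text embeddings
struct_emb = rng.standard_normal((4, 8))  # stand-in for GNN node embeddings
for _ in range(3):
    text_emb, struct_emb = fusion_unit(text_emb, struct_emb)
print(text_emb.shape, struct_emb.shape)
```

Stacking several such units lets textual and structural signals refine each other repeatedly, which is the bidirectional flow the abstract describes; in the full model the LLM side would stay frozen, with only LoRA adapters and the fusion parameters being trained.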