Recently, large language models (LLMs) have demonstrated superior capabilities in understanding and zero-shot learning on textual data, promising significant advances for many text-related domains. In the graph domain, many real-world scenarios also involve textual data, where tasks and node features can be described by text. These text-attributed graphs (TAGs) have broad applications in social media, recommendation systems, etc. Thus, this paper explores how to utilize LLMs to model TAGs. Previous methods for TAG modeling are based on million-scale LMs; when scaled up to billion-scale LLMs, they incur prohibitive computational costs, and they also overlook the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. In terms of efficiency, the GNN adapter introduces only a few trainable parameters and can be trained at low computational cost. The entire framework is trained via auto-regression on node text (next-token prediction). Once trained, GraphAdapter can be seamlessly fine-tuned with task-specific prompts for various downstream tasks. Through extensive experiments on multiple real-world TAGs, GraphAdapter based on Llama 2 achieves an average improvement of approximately 5\% on node classification. Furthermore, GraphAdapter can also adapt to other language models, including RoBERTa and GPT-2. These promising results demonstrate that GNNs can serve as effective adapters for LLMs in TAG modeling.
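To make the adapter idea concrete, here is a minimal, hypothetical sketch of the mechanism the abstract describes: a frozen LM produces a hidden state per node, and a small trainable GNN "adapter" blends each node's state with an aggregate of its neighbors' states. The toy graph, 2-d vectors, mean aggregation, and scalar mixing weight `alpha` below are illustrative stand-ins, not the paper's actual architecture or training code (GraphAdapter uses Llama 2 hidden states and learned GNN weights trained by next-token prediction).

```python
# Hypothetical sketch of a GNN adapter over frozen LM node states.
# All names, shapes, and the mixing weight are toy assumptions.

def mean_neighbors(node, edges, states):
    """Average the (frozen) LM states of a node's neighbors."""
    nbrs = [v for (u, v) in edges if u == node] + \
           [u for (u, v) in edges if v == node]
    dim = len(next(iter(states.values())))
    if not nbrs:
        return [0.0] * dim
    agg = [0.0] * dim
    for n in nbrs:
        agg = [a + s for a, s in zip(agg, states[n])]
    return [a / len(nbrs) for a in agg]

def gnn_adapter(node, edges, states, alpha=0.5):
    """One adapter layer: blend a node's own LM state with its
    neighborhood aggregate (alpha stands in for trainable weights)."""
    own = states[node]
    agg = mean_neighbors(node, edges, states)
    return [(1 - alpha) * o + alpha * a for o, a in zip(own, agg)]

# Toy text-attributed graph: 3 nodes with frozen 2-d "LM" states.
states = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b"), ("a", "c")]

fused = gnn_adapter("a", edges, states)
print(fused)  # node "a" blended with the mean of b and c: [0.75, 0.5]
```

In the actual method, only the adapter's parameters would receive gradients; the LM stays frozen, which is what keeps the trainable parameter count small.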