Learning from Text-Attributed Graphs (TAGs) has attracted significant attention due to its wide range of real-world applications. The rapid evolution of large language models (LLMs) has revolutionized the way we process textual data, indicating strong potential to replace the shallow text embeddings generally used in Graph Neural Networks (GNNs). However, we find that existing LLM approaches that exploit text information in graphs suffer from poor computation and data efficiency. In this work, we introduce LEADING, a novel and efficient approach for the end-to-end fine-tuning of LLMs on TAGs. The proposed approach maintains computation cost and memory overhead comparable to the graph-less fine-tuning of LLMs. Moreover, it effectively transfers the rich knowledge in LLMs to downstream graph learning tasks with limited labeled data in semi-supervised learning. Its superior computation and data efficiency are demonstrated through comprehensive experiments, offering a promising solution for a wide range of LLMs and graph learning tasks on TAGs.