Learning on graphs has attracted considerable attention because of its wide range of real-world applications. The most popular pipeline for learning on graphs with textual node attributes relies primarily on Graph Neural Networks (GNNs) and uses shallow text embeddings as initial node representations, which limits both general knowledge and deep semantic understanding. In recent years, Large Language Models (LLMs) have been shown to possess extensive common knowledge and powerful semantic comprehension abilities, and they have revolutionized existing workflows for handling text data. In this paper, we explore the potential of LLMs in graph machine learning, especially for the node classification task, and investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors. The former leverages LLMs to enhance nodes' text attributes with their massive knowledge and then generates predictions through GNNs. The latter attempts to employ LLMs directly as standalone predictors. We conduct comprehensive and systematic studies of these two pipelines under various settings. From the empirical results, we make original observations and find new insights that open new possibilities and suggest promising directions for leveraging LLMs for learning on graphs. Our code and datasets are available at https://github.com/CurryTang/Graph-LLM.
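To make the two pipelines concrete, the following is a minimal sketch on a toy three-node graph. It is not the implementation from the paper's repository. All names here (`embed_text`, `mean_aggregate`, `build_prompt`), the toy texts, and the prompt wording are hypothetical. `embed_text` is a hashed bag-of-words placeholder standing in for an LLM-derived embedding, and `mean_aggregate` is a single weight-free neighbor-averaging step standing in for a trained GNN layer.

```python
import zlib


def embed_text(text, dim=8):
    """Hypothetical placeholder for an LLM-derived embedding: hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


def mean_aggregate(features, neighbors):
    """One weight-free message-passing step: average each node with its neighbors."""
    out = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbors[node]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def build_prompt(node, texts, neighbors, label_names):
    """Assemble a node classification prompt; the LLM call itself is omitted."""
    lines = [f"Paper: {texts[node]}"]
    for n in neighbors[node]:
        lines.append(f"Neighbor: {texts[n]}")
    lines.append("Categories: " + ", ".join(label_names))
    lines.append("Answer with exactly one category.")
    return "\n".join(lines)


# Toy graph with textual node attributes (assumed data, for illustration only).
texts = [
    "graph neural networks for node classification",
    "message passing on citation graphs",
    "language models for text understanding",
]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# LLMs-as-Enhancers: text -> (enhanced) features -> graph aggregation.
features = [embed_text(t) for t in texts]
smoothed = mean_aggregate(features, neighbors)

# LLMs-as-Predictors: node text and neighbor text -> prompt for an LLM.
prompt = build_prompt(1, texts, neighbors, ["Graph ML", "NLP"])
```

In the enhancer route, the text-derived features would feed a trained GNN classifier rather than a single averaging step. In the predictor route, `prompt` would be sent to an LLM whose answer is parsed into a label.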