Learning on graphs has attracted considerable attention because of its wide range of real-world applications. The most popular pipeline for learning on graphs with textual node attributes relies primarily on Graph Neural Networks (GNNs) and uses shallow text embeddings as initial node representations, which limits both general knowledge and deep semantic understanding. In recent years, Large Language Models (LLMs) have been shown to possess extensive common knowledge and powerful semantic comprehension abilities, which have revolutionized existing workflows for handling text data. In this paper, we explore the potential of LLMs in graph machine learning, especially for the node classification task, and investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors. The former leverages LLMs to enhance nodes' text attributes with their massive knowledge and then generates predictions through GNNs. The latter attempts to employ LLMs directly as standalone predictors. We conduct comprehensive and systematic studies of these two pipelines under various settings. From the empirical results, we make original observations and find new insights that open new possibilities and suggest promising directions for leveraging LLMs for learning on graphs.
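To make the contrast between the two pipelines concrete, the following is a minimal sketch of the data flow in each, not the implementation studied in the paper. Everything in it is an illustrative assumption: the toy graph, the `embed_text` bag-of-words function standing in for an LLM-derived embedding, the weight-free `aggregate` step standing in for a trained GNN layer, the prompt wording in `build_prompt`, and the `fake_llm` stub standing in for a real model call.

```python
from typing import Callable, Dict, List

# Toy text-attributed graph (hypothetical data): node id -> text, plus adjacency lists.
texts: Dict[int, str] = {
    0: "graph neural networks for node classification",
    1: "message passing on graphs",
    2: "language models for text understanding",
}
neighbors: Dict[int, List[int]] = {0: [1], 1: [0, 2], 2: [1]}

VOCAB = ["graph", "graphs", "node", "language", "text"]


def embed_text(text: str) -> List[float]:
    """Stand-in for an LLM-derived embedding (assumption): bag-of-words counts."""
    words = text.split()
    return [float(words.count(w)) for w in VOCAB]


def aggregate(
    features: Dict[int, List[float]], adjacency: Dict[int, List[int]]
) -> Dict[int, List[float]]:
    """Average each node's features with its neighbors' features.

    A simplified message-passing step with no learned weights; a real GNN
    would apply trainable transformations and a classifier head on top.
    """
    out: Dict[int, List[float]] = {}
    for v, nbrs in adjacency.items():
        group = [features[v]] + [features[u] for u in nbrs]
        out[v] = [sum(col) / len(group) for col in zip(*group)]
    return out


def build_prompt(node: int, labels: List[str]) -> str:
    """Describe a node and its neighbors in plain text (hypothetical prompt format)."""
    neighbor_text = "\n".join(f"- {texts[u]}" for u in neighbors[node])
    return (
        f"Paper: {texts[node]}\n"
        f"Linked papers:\n{neighbor_text}\n"
        f"Choose one category from {labels}."
    )


def predict_with_llm(node: int, labels: List[str], llm: Callable[[str], str]) -> str:
    """LLMs-as-Predictors: the language model answers directly from the prompt."""
    return llm(build_prompt(node, labels)).strip()


def fake_llm(prompt: str) -> str:
    """Stub that always returns the same label; replace with a real model call."""
    return "graph-learning"


# LLMs-as-Enhancers: the LLM side produces node features, a GNN consumes them.
features = {v: embed_text(t) for v, t in texts.items()}
hidden = aggregate(features, neighbors)

# LLMs-as-Predictors: graph structure is folded into text and the LLM predicts.
labels = ["graph-learning", "nlp"]
prediction = predict_with_llm(0, labels, fake_llm)
```

In the first pipeline the language model only affects the node features and the graph structure is handled by the GNN. In the second, structure reaches the model only through whatever neighbor information is written into the prompt, which is one reason the two pipelines are worth studying separately.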