Learning on graphs has attracted considerable attention because of its wide range of real-world applications. The most popular pipeline for learning on graphs with textual node attributes relies primarily on Graph Neural Networks (GNNs) and uses shallow text embeddings as initial node representations, which limits both general knowledge and deep semantic understanding. In recent years, Large Language Models (LLMs) have been shown to possess extensive common knowledge and powerful semantic comprehension abilities, which have revolutionized existing workflows for handling text data. In this paper, we explore the potential of LLMs in graph machine learning, especially for the node classification task, and investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors. The former leverages LLMs to enhance nodes' text attributes with their massive knowledge and then generates predictions through GNNs. The latter attempts to employ LLMs directly as standalone predictors. We conduct comprehensive and systematic studies of these two pipelines under various settings. From the empirical results, we make original observations and derive new insights that open new possibilities and suggest promising directions for leveraging LLMs for learning on graphs.
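To make the contrast between the two pipelines concrete, the following is a minimal sketch, not the implementation studied in the paper. Every name in it is hypothetical: `toy_llm_embed` and `toy_llm_answer` are keyword-based stand-ins for real LLM calls, `toy_gnn_predict` is a single round of neighbor averaging standing in for a trained GNN, and the label set, prompt wording, and three-node graph are invented for illustration.

```python
from typing import Callable, Dict, List

Graph = Dict[int, List[int]]      # node id -> neighbor ids (assumed adjacency list)
LABELS = ["graphs", "vision"]     # hypothetical label set


def toy_llm_embed(text: str) -> List[float]:
    """Stand-in for an LLM text encoder: one keyword count per label."""
    words = text.lower().split()
    return [float(words.count("graph")), float(words.count("image"))]


def toy_gnn_predict(features: Dict[int, List[float]], graph: Graph) -> Dict[int, str]:
    """Stand-in for a GNN: average a node's features with its neighbors', then argmax."""
    preds = {}
    for node, neighbors in graph.items():
        group = [features[node]] + [features[n] for n in neighbors]
        mean = [sum(column) / len(group) for column in zip(*group)]
        preds[node] = LABELS[mean.index(max(mean))]
    return preds


def llms_as_enhancers(
    texts: Dict[int, str],
    graph: Graph,
    embed: Callable[[str], List[float]] = toy_llm_embed,
    gnn: Callable[[Dict[int, List[float]], Graph], Dict[int, str]] = toy_gnn_predict,
) -> Dict[int, str]:
    """Enhancer pipeline: the LLM produces node features, the GNN makes the prediction."""
    features = {node: embed(text) for node, text in texts.items()}
    return gnn(features, graph)


def toy_llm_answer(prompt: str) -> str:
    """Stand-in for a chat LLM: looks only at the 'Text:' line of the prompt."""
    text_line = prompt.splitlines()[0].lower()
    return "graphs" if "graph" in text_line else "vision"


def llms_as_predictors(
    texts: Dict[int, str],
    llm: Callable[[str], str] = toy_llm_answer,
) -> Dict[int, str]:
    """Predictor pipeline: the LLM is prompted per node and its answer is the prediction."""
    preds = {}
    for node, text in texts.items():
        prompt = f"Text: {text}\nCategories: {', '.join(LABELS)}\nAnswer with one category."
        preds[node] = llm(prompt)
    return preds


# Invented three-node example: nodes 0 and 2 are linked, node 1 is isolated.
texts = {0: "graph neural network", 1: "image classification", 2: "spectral methods"}
graph = {0: [2], 1: [], 2: [0]}

print(llms_as_enhancers(texts, graph))
print(llms_as_predictors(texts))
```

In this toy setup the two functions disagree on node 2, because only the enhancer sketch passes the edges to a downstream model. That difference is a property of the stand-ins chosen here and should not be read as a result from the paper.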