Representation learning on text-attributed graphs (TAGs) has become a critical research problem in recent years. A typical example of a TAG is a paper citation graph, where the text of each paper serves as node attributes. Initial graph neural network (GNN) pipelines handled these text attributes by transforming them into shallow or hand-crafted features, such as skip-gram or bag-of-words features. Recent efforts have focused on enhancing these pipelines with language models (LMs), but these approaches typically demand intricate designs and substantial computational resources. With the advent of powerful large language models (LLMs) such as GPT or Llama2, which demonstrate an ability to reason and to utilize general knowledge, there is a growing need for techniques that combine the textual modelling abilities of LLMs with the structural learning capabilities of GNNs. Hence, in this work, we focus on leveraging LLMs to capture textual information as features, which can be used to boost GNN performance on downstream tasks. A key innovation is our use of explanations as features: we prompt an LLM to perform zero-shot classification, request textual explanations for its decision-making process, and design an LLM-to-LM interpreter to translate these explanations into informative features for downstream GNNs. Our experiments demonstrate that our method achieves state-of-the-art results on well-established TAG datasets, including Cora, PubMed, and ogbn-arxiv, as well as on our newly introduced dataset, tape-arxiv23. Furthermore, our method significantly speeds up training, achieving a 2.88× speedup over the closest baseline on ogbn-arxiv. Lastly, we believe the versatility of the proposed method extends beyond TAGs and holds the potential to enhance other tasks involving graph-text data. Our code and datasets are available at: https://github.com/XiaoxinHe/TAPE.
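To make the explanations-as-features step concrete, the following is a minimal sketch of how a zero-shot prompt might be built and how the LLM's reply might be split into a prediction and an explanation. The prompt wording, the `Answer:` / `Explanation:` reply format, and the names `build_prompt`, `parse_response`, and `LLMOutput` are illustrative assumptions, not taken from the released code; the call to the LLM itself is omitted.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LLMOutput:
    prediction: str   # category named by the LLM
    explanation: str  # free-text rationale, to be encoded by a smaller LM


def build_prompt(title: str, abstract: str, labels: Sequence[str]) -> str:
    """Compose a zero-shot classification prompt that also requests an explanation.

    The wording is an assumed template, not the authors' actual prompt.
    """
    return (
        f"Title: {title}\n"
        f"Abstract: {abstract}\n"
        f"Question: Which of these categories does the paper belong to: "
        f"{', '.join(labels)}?\n"
        "Reply with 'Answer: <category>' followed by 'Explanation: <reasoning>'."
    )


def parse_response(response: str) -> LLMOutput:
    """Split a reply that follows the assumed 'Answer: ... Explanation: ...' format."""
    answer_part, _, explanation = response.partition("Explanation:")
    prediction = answer_part.replace("Answer:", "", 1).strip()
    return LLMOutput(prediction=prediction, explanation=explanation.strip())
```

In a pipeline of the kind described above, the `explanation` string (possibly together with the original title and abstract) would then be passed to a smaller LM to produce node features for the GNN.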