Representation learning on text-attributed graphs (TAGs) has become a critical research problem in recent years. A typical example of a TAG is a paper citation graph, where the text of each paper serves as its node attributes. Early graph neural network (GNN) pipelines handled these text attributes by transforming them into shallow or hand-crafted features, such as skip-gram or bag-of-words features. Recent efforts have focused on enhancing these pipelines with language models (LMs), an approach that typically demands intricate designs and substantial computational resources. With the advent of powerful large language models (LLMs) such as GPT or Llama2, which demonstrate an ability to reason and to use general knowledge, there is a growing need for techniques that combine the textual modelling abilities of LLMs with the structural learning capabilities of GNNs. Hence, in this work, we focus on leveraging LLMs to capture textual information as features, which can be used to boost GNN performance on downstream tasks. A key innovation is our use of explanations as features: we prompt an LLM to perform zero-shot classification, request textual explanations for its decision-making process, and design an LLM-to-LM interpreter to translate these explanations into informative features that enhance downstream GNNs. Our experiments demonstrate that our method achieves state-of-the-art results on well-established TAG datasets, including Cora, PubMed, and ogbn-arxiv, as well as on our newly introduced dataset, arXiv-2023. Furthermore, our method significantly speeds up training, achieving a 2.88$\times$ speedup over the closest baseline on ogbn-arxiv. Lastly, we believe the versatility of the proposed method extends beyond TAGs and holds the potential to enhance other tasks involving graph-text data.\footnote{Our code and datasets are available at: \url{https://github.com/XiaoxinHe/TAPE}}
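The explanations-as-features pipeline described in the abstract (prompt an LLM for a zero-shot label plus an explanation, encode the explanation as features, pass the features to a GNN) can be illustrated with a minimal sketch. It is not the released TAPE implementation. The function names, the label set, and the prompt wording are hypothetical. The LLM call itself is omitted, a hashed bag-of-words encoder stands in for the fine-tuned LM interpreter, and a single round of mean neighbourhood aggregation stands in for a trained GNN layer.

```python
import hashlib
import re

# Hypothetical label set, for illustration only.
LABELS = ["Theory", "Neural_Networks", "Reinforcement_Learning"]


def build_prompt(title, abstract, labels=LABELS):
    """Build a zero-shot prompt that asks for a label and an explanation."""
    return (
        f"Title: {title}\nAbstract: {abstract}\n"
        "Question: Which category does this paper belong to? "
        f"Choose from: {', '.join(labels)}. "
        "Answer in the form 'Category: <label>' on one line, "
        "then 'Explanation: <text>'."
    )


def parse_response(response, labels=LABELS):
    """Split an LLM response into (label or None, explanation text)."""
    cat = re.search(r"Category:\s*(.+)", response)
    exp = re.search(r"Explanation:\s*(.+)", response, flags=re.DOTALL)
    label = cat.group(1).strip() if cat else None
    if label not in labels:
        label = None  # reject anything outside the known label set
    explanation = exp.group(1).strip() if exp else ""
    return label, explanation


def embed_text(text, dim=16):
    """Stand-in encoder: hashed bag-of-words counts (NOT a language model)."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec


def mean_aggregate(features, edges):
    """One round of mean aggregation over undirected edges, with self-loops."""
    n = len(features)
    neighbours = {i: {i} for i in range(n)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in neighbours[i]) / len(neighbours[i])
         for k in range(dim)]
        for i in range(n)
    ]
```

In a fuller pipeline, `embed_text` would be replaced by a fine-tuned LM applied to each node's explanation, and `mean_aggregate` by a trained GNN. The sketch only shows how the three stages connect.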