Text-Attributed Graphs (TAGs) have recently attracted significant research attention owing to their broad applicability to real-world data such as citation networks, e-commerce platforms, social media, and web pages. Inspired by the remarkable semantic understanding ability of Large Language Models (LLMs), numerous works have attempted to apply LLMs to TAGs. However, existing methods still struggle to generalize across diverse graphs and tasks, and their ability to capture transferable graph structural patterns remains limited. To address this, we introduce GraspLLM, a framework that combines Graph structural comprehension with the semantic understanding of LLMs to enhance cross-dataset and cross-task generalizability. Specifically, we represent node texts from different graphs in a unified semantic space using a frozen general embedding model, on top of which we perform motif-aware contrastive learning across multiple motif-induced adjacency matrices to extract dataset-agnostic structural information. Then, using our proposed optimal contextual subgraph, we extract the most contextually relevant subgraph for each target node and align these subgraphs to the token space of the LLM via an alignment projector. Extensive experiments on TAG benchmark datasets spanning diverse domains show that GraspLLM consistently outperforms previous LLM-based methods for TAGs, especially in zero-shot scenarios, highlighting its strong generalizability across datasets and tasks. Our code is available at https://github.com/Heinz217/GraspLLM.
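To make the notion of a motif-induced adjacency matrix concrete, the following is a minimal sketch for a single motif. It assumes the triangle motif and an unweighted, undirected graph; the abstract does not state which motifs GraspLLM uses, and the function name `triangle_motif_adjacency` is hypothetical rather than taken from the released code. Each existing edge is reweighted by the number of triangles that contain it, which is the standard construction (A·A) ∘ A.

```python
def triangle_motif_adjacency(adj):
    """Reweight each edge (i, j) by the number of triangles containing it.

    `adj` is a symmetric 0/1 adjacency matrix given as a list of lists.
    The triangle motif is an illustrative assumption, not the paper's choice.
    """
    n = len(adj)
    motif = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                # Common neighbors k of i and j each close one triangle i-k-j.
                motif[i][j] = sum(adj[i][k] * adj[k][j] for k in range(n))
    return motif


# Toy graph: triangle 0-1-2 plus a pendant edge 2-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
M = triangle_motif_adjacency(A)
```

In this toy graph the pendant edge 2-3 belongs to no triangle, so it receives weight zero in `M`, while the three triangle edges each receive weight one. Repeating this for several motifs would yield the multiple structural views over which a contrastive objective could be defined.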