Graph Neural Networks (GNNs) learn graph structure through recursive information exchange and aggregation among nodes. To improve robustness, self-supervised learning (SSL) has become an important approach to data augmentation. However, existing methods often depend on fine-tuning with task-specific labels, which limits their effectiveness when labeled data is scarce. Our research addresses this limitation by improving the generalization of graph models in zero-shot learning settings. Inspired by the success of large language models (LLMs), we aim to build a graph-oriented LLM that generalizes well across diverse datasets and tasks without relying on downstream graph data. We introduce GraphGPT, a framework that integrates LLMs with graph structural knowledge through graph instruction tuning. The framework includes a text-graph grounding component that links textual and graph structures, and a dual-stage instruction tuning approach with a lightweight graph-text alignment projector. Together, these components allow LLMs to comprehend complex graph structures and adapt to diverse datasets and tasks. GraphGPT generalizes better than existing baselines in both supervised and zero-shot graph learning tasks. The open-source implementation of GraphGPT is available at https://github.com/HKUDS/GraphGPT.
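To make the idea of a lightweight graph-text alignment projector concrete, the following is a minimal sketch and not the GraphGPT implementation. It assumes the projector is a single linear layer that maps node embeddings from a graph encoder into the LLM's token-embedding space, and that the projected "graph tokens" replace one placeholder position in the prompt. The function names (`make_projector`, `project`, `splice_graph_tokens`), the dimensions, and the placeholder mechanism are all hypothetical and chosen only for illustration.

```python
import random


def make_projector(graph_dim, llm_dim, seed=0):
    """Create (weights, bias) for one linear layer: graph_dim -> llm_dim.

    Assumption: a single linear layer is "lightweight" enough to stand in
    for the alignment projector named in the abstract.
    """
    rng = random.Random(seed)
    scale = 1.0 / graph_dim ** 0.5
    weights = [
        [rng.uniform(-scale, scale) for _ in range(graph_dim)]
        for _ in range(llm_dim)
    ]
    bias = [0.0] * llm_dim
    return weights, bias


def project(node_embeddings, weights, bias):
    """Map each node embedding into the (hypothetical) LLM embedding space."""
    return [
        [sum(w * x for w, x in zip(row, emb)) + b for row, b in zip(weights, bias)]
        for emb in node_embeddings
    ]


def splice_graph_tokens(token_embeddings, placeholder_index, graph_tokens):
    """Replace one placeholder position with the projected graph tokens."""
    return (
        token_embeddings[:placeholder_index]
        + graph_tokens
        + token_embeddings[placeholder_index + 1:]
    )


# Toy example: 3 nodes with 4-dim graph embeddings, a 6-dim LLM embedding space.
weights, bias = make_projector(graph_dim=4, llm_dim=6)
nodes = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [1.0, 0.0, -1.0, 0.0],
]
graph_tokens = project(nodes, weights, bias)

# Five text-token embeddings; index 2 plays the role of the graph placeholder.
prompt = [[0.0] * 6 for _ in range(5)]
sequence = splice_graph_tokens(prompt, 2, graph_tokens)
print(len(sequence), len(sequence[0]))  # prints: 7 6
```

In a setup like this, only the projector's weights would need training during instruction tuning while the graph encoder and LLM stay frozen. This is one way to read "lightweight", but the abstract does not specify which parameters are updated.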