Recently, research on Text-Attributed Graphs (TAGs) has gained significant attention due to the prevalence of free-text node features in real-world applications and the advancements in Large Language Models (LLMs) that bolster TAG methodologies. However, current TAG approaches face two primary challenges: (i) heavy reliance on label information and (ii) limited cross-domain zero/few-shot transferability. These issues constrain the scaling of both data and model size, owing to high labeling costs and the data demands implied by scaling laws, complicating the development of graph foundation models with strong transferability. In this work, we propose the GraphCLIP framework to address these challenges by learning graph foundation models with strong cross-domain zero/few-shot transferability through a self-supervised contrastive graph-summary pretraining method. Specifically, we generate and curate large-scale graph-summary pair data with the assistance of LLMs, and introduce a novel graph-summary pretraining method, combined with invariant learning, to enhance graph foundation models with strong cross-domain zero-shot transferability. For few-shot learning, we propose a novel graph prompt tuning technique aligned with our pretraining objective to mitigate catastrophic forgetting and minimize learning costs. Extensive experiments show the superiority of GraphCLIP in both zero-shot and few-shot settings, while evaluations across various downstream tasks confirm the versatility of GraphCLIP. Our code is available at: https://github.com/ZhuYun97/GraphCLIP
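To make the contrastive graph-summary pretraining objective concrete, the sketch below shows a generic CLIP-style symmetric InfoNCE loss over a batch of paired graph and summary embeddings. This is an illustrative assumption based on the CLIP formulation the framework's name alludes to, not the paper's exact loss; the function name, the NumPy implementation, and the temperature value are all hypothetical.

```python
import numpy as np

def clip_style_contrastive_loss(graph_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of (graph, summary) embedding pairs.

    graph_emb, text_emb: arrays of shape (B, D); row i of each is a matched pair.
    Returns the mean of the graph->text and text->graph cross-entropy losses.
    """
    # L2-normalize so the dot product is cosine similarity.
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (B, B) similarity matrix; entry (i, j) scores graph i against summary j.
    logits = g @ t.T / temperature
    labels = np.arange(len(g))  # matched pairs sit on the diagonal

    def cross_entropy(l):
        # Numerically stable log-softmax over each row.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the two directions: graph-to-summary and summary-to-graph.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Under this objective, each graph embedding is pulled toward its own LLM-generated summary and pushed away from the other summaries in the batch, which is what enables zero-shot transfer: at inference time a label can be scored by embedding its textual description and comparing similarities.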