Graph foundation models (GFMs) have emerged as a dominant paradigm in graph representation learning by leveraging large-scale pre-training for cross-domain inference. However, the parameterized knowledge encoded within these models is insufficient to cope with distribution shifts, which limits their generalization ability. To mitigate this issue, retrieval-augmented generation (RAG) has been introduced to incorporate external knowledge at inference time. Nevertheless, existing RAG frameworks operating in Euclidean space suffer from a fundamental geometric limitation: the polynomial volume growth of Euclidean space is inherently mismatched with tree-structured external knowledge bases, in which the number of nodes grows exponentially with depth. This mismatch leads to a loss of semantic granularity in retrieval and gives rise to the hubness phenomenon, in which a few items appear among the nearest neighbors of disproportionately many queries. To address this limitation, we propose a Hyperbolic Retrieval-Augmented Generation (HyRAG) framework designed to enhance the generalization capabilities of GFMs. Specifically, a Hyperbolic Knowledge Indexing module retains the tree-like hierarchies of the external knowledge base by modeling them in hyperbolic space. A Multi-granularity Retrieval module then provides GFMs with global semantic anchors and local semantic nuances through coarse-grained and fine-grained knowledge retrieval, respectively. Finally, a Dual-path Fusion module integrates the retrieved knowledge for graph tasks at both the feature and structural levels. Experiments on multiple graph benchmarks demonstrate significant improvements in the zero-shot setting, highlighting the method's ability to generalize for robust GFM inference.
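
To make the geometric argument concrete, the following is a minimal sketch of distance computation in hyperbolic space. The abstract does not state which model of hyperbolic space HyRAG uses; the Poincaré ball model and the function names `poincare_distance` and `euclidean_distance` are assumptions chosen for illustration only. The sketch shows that two pairs of points with the same Euclidean gap are much farther apart hyperbolically when they lie near the boundary of the ball, which is the extra room that lets deep, fine-grained levels of a tree stay separated.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Illustrative assumption: the source does not specify the hyperbolic model.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Guard against division by a vanishing denominator near the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def euclidean_distance(u, v):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two pairs with (approximately) the same Euclidean gap of 0.1.
near_origin = ((0.0, 0.0), (0.1, 0.0))      # e.g. coarse, root-level concepts
near_boundary = ((0.85, 0.0), (0.95, 0.0))  # e.g. fine-grained, leaf-level concepts

d_origin = poincare_distance(*near_origin)
d_boundary = poincare_distance(*near_boundary)

print("Euclidean gaps:", euclidean_distance(*near_origin), euclidean_distance(*near_boundary))
print("Hyperbolic gaps:", d_origin, d_boundary)
```

Under this assumed model, a coarse-to-fine retrieval scheme could rank candidates by `poincare_distance` instead of a Euclidean or cosine score; whether HyRAG does exactly this is not stated in the abstract.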