Graph foundation models (GFMs) have emerged as a dominant paradigm in graph representation learning, leveraging large-scale pre-training for cross-domain inference. However, the parametric knowledge encoded in these models is insufficient to cope with distribution shifts, which limits their generalization ability. To mitigate this issue, retrieval-augmented generation (RAG) has been introduced to incorporate external knowledge at inference time. Nevertheless, existing RAG frameworks operate in Euclidean space and suffer from a fundamental geometric limitation: the volume of Euclidean space grows only polynomially with radius, which is inherently mismatched with tree-structured external knowledge bases, where the number of nodes grows exponentially with depth. This mismatch causes a loss of semantic granularity in retrieval and gives rise to the hubness phenomenon, in which a few items appear among the nearest neighbors of a disproportionate number of queries. To address this limitation, we propose Hyperbolic Retrieval-Augmented Generation (HyRAG), a framework designed to enhance the generalization capabilities of GFMs. Specifically, the Hyperbolic Knowledge Indexing module preserves the tree-like hierarchies of the external knowledge base by modeling them in hyperbolic space. The Multi-granularity Retrieval module then provides GFMs with global semantic anchors and local semantic nuances through coarse-grained and fine-grained knowledge retrieval, respectively. Finally, the Dual-path Fusion module integrates the retrieved knowledge into graph tasks at both the feature and structural levels. Experiments on multiple graph benchmarks demonstrate significant improvements in the zero-shot setting, highlighting the generalization ability of our method for robust GFM inference.
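The geometric argument above can be illustrated with a minimal sketch. It is not the HyRAG implementation: the abstract does not say which model of hyperbolic space is used, so the unit Poincaré ball with curvature -1 is assumed here, and the function names are hypothetical. The sketch compares how the circumference of a circle grows with radius in the two geometries and computes the standard Poincaré-ball geodesic distance.

```python
import math


def euclidean_circumference(r):
    """Circumference of a circle of radius r in the Euclidean plane (linear in r)."""
    return 2.0 * math.pi * r


def hyperbolic_circumference(r):
    """Circumference of a circle of radius r in the hyperbolic plane of
    curvature -1 (grows like e^r for large r)."""
    return 2.0 * math.pi * math.sinh(r)


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit
    Poincare ball (assumed model; curvature -1)."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(
        1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    )


if __name__ == "__main__":
    # Room available at distance r from a root: linear vs. exponential growth.
    for r in (1, 2, 4, 8):
        print(r, euclidean_circumference(r), hyperbolic_circumference(r))

    # Two "sibling" points near the boundary are far apart hyperbolically,
    # even though their Euclidean separation is small.
    a, b = (0.95, 0.0), (0.0, 0.95)
    print(math.dist(a, b), poincare_distance(a, b))
```

Because the available room at radius r grows exponentially, a tree whose node count grows exponentially with depth can be embedded with low distortion, which is the property a hyperbolic index would rely on to keep coarse and fine levels of a hierarchy separable.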