Graph-based retrieval-augmented generation (GraphRAG) exploits structured knowledge to support knowledge-intensive reasoning. However, most existing methods treat graphs as intermediate artifacts, and the few subgraph-based retrieval methods rely on heuristic rules coupled to domain-specific distributions. These methods fail in typical cold-start scenarios where data in the target domain is scarce, yielding reasoning contexts that are either informationally incomplete or structurally redundant. In this work, we revisit retrieval from a structural perspective and propose GFM-Retriever, which responds to user queries directly with a subgraph: a pre-trained Graph Foundation Model (GFM) acts as a cross-domain retriever for multi-hop, path-aware reasoning. Building on this perspective, we repurpose the pre-trained GFM from an entity-ranking function into a generalized retriever that supports cross-domain retrieval. On top of the retrieved graph, we further derive a label-free subgraph selector, optimized under a principled Information Bottleneck objective, that identifies a query-conditioned subgraph containing informationally sufficient and structurally minimal golden evidence in a self-contained "core set". To connect structure with generation, we explicitly extract and reorganize relational paths as in-context prompts, enabling interpretable reasoning. Extensive experiments on multi-hop question answering benchmarks demonstrate that GFM-Retriever achieves state-of-the-art performance in both retrieval quality and answer generation while maintaining efficiency.