Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge. Retrieval-augmented generation (RAG) mitigates this by incorporating external knowledge, yet existing RAG methods struggle with knowledge-intensive tasks due to fragmented information and weak modeling of knowledge structure. Graphs offer a natural way to model relationships within knowledge, but LLMs operate on unstructured text and cannot effectively reason over graph-structured data. Recent graph-enhanced RAG (GraphRAG) methods attempt to bridge this gap by constructing tailored graphs and enabling LLMs to reason over them. However, these methods often depend on ad-hoc graph designs, heuristic search, or costly agent pipelines, which hinder scalability and generalization. To address these challenges, we present G-reasoner, a unified framework that integrates graph and language foundation models for scalable reasoning over diverse graph-structured knowledge. Central to our approach is QuadGraph, a standardized four-layer abstraction that unifies heterogeneous knowledge sources into a common graph representation. Building on this, we introduce a 34M-parameter graph foundation model (GFM) that jointly captures graph topology and textual semantics, and integrate it with LLMs to enhance reasoning in downstream applications. To ensure scalability and efficiency, we implement mixed-precision training and distributed message-passing so that the GFM scales across multiple GPUs. Extensive experiments on six benchmarks show that G-reasoner consistently outperforms state-of-the-art baselines, significantly enhances LLM reasoning, and achieves strong efficiency and cross-graph generalization.
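The abstract states that the GFM is trained with mixed-precision training and distributed message-passing to scale across GPUs. The sketch below is a minimal illustration of that general pattern, assuming a PyTorch-style implementation; all class and function names (e.g., `TinyGFM`, `MessagePassingLayer`) are hypothetical and not taken from the G-reasoner codebase. It shows a small message-passing model over node features trained under `torch.autocast` with a gradient scaler; multi-GPU scaling would additionally wrap the model in `DistributedDataParallel` and partition the graph across ranks.

```python
# Illustrative sketch only: mixed-precision training of a small message-passing model.
# Names and architecture are assumptions for exposition, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MessagePassingLayer(nn.Module):
    """Mean-aggregation message passing over a dense adjacency matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)  # guard against isolated nodes
        msg = adj @ x / deg                                  # aggregate neighbor features
        return F.relu(self.proj(msg) + x)                    # transform with residual connection

class TinyGFM(nn.Module):
    """Two message-passing layers followed by a per-node classifier head."""
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.layers = nn.ModuleList([MessagePassingLayer(dim) for _ in range(2)])
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x, adj)
        return self.head(x)

def train_step(model, optimizer, scaler, x, adj, labels, use_cuda):
    optimizer.zero_grad(set_to_none=True)
    # Mixed precision: forward/backward in fp16 where safe, fp32 master weights kept by the optimizer.
    with torch.autocast(device_type="cuda", dtype=torch.float16, enabled=use_cuda):
        loss = F.cross_entropy(model(x, adj), labels)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

if __name__ == "__main__":
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    model = TinyGFM(dim=64, num_classes=4).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)
    x = torch.randn(100, 64, device=device)                        # node features (e.g., text embeddings)
    adj = (torch.rand(100, 100, device=device) < 0.05).float()     # random sparse adjacency
    labels = torch.randint(0, 4, (100,), device=device)
    print(train_step(model, optimizer, scaler, x, adj, labels, use_cuda))
```

For distributed message-passing, the same step would typically run inside a `torch.distributed` process group, with the model wrapped in `DistributedDataParallel` and each rank holding a partition of the graph; the details of how G-reasoner partitions the QuadGraph are not specified in the abstract.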