Reasoning over structured graphs remains a fundamental challenge for Large Language Models (LLMs), particularly when scaling to large graphs. Existing approaches typically follow the retrieval-augmented generation (RAG) paradigm: first retrieving subgraphs relevant to the query and then generating answers conditioned on the retrieved subgraphs. However, such two-phase pipelines often struggle to faithfully incorporate graph structure, since the generation process is ultimately constrained by the quality and completeness of the retrieved subgraph. Although many advanced retrievers have been proposed recently to mitigate this issue, they are usually tailored to the training graphs and generalize poorly to unseen graphs, which limits their practical applicability. In this work, we propose Reasoning by Exploration (RoE), a novel approach that unifies retrieval and generation by framing reasoning over graphs as a process of graph exploration. At each step, the LLM selects candidate nodes and edges to explore, gradually constructing reasoning paths and generating answers along the way. To enable effective exploration, RoE is trained in two stages: supervised fine-tuning (SFT) on gold reasoning paths, followed by reinforcement learning (RL) to enhance exploration effectiveness and generalization. Experiments on benchmark datasets demonstrate that RoE achieves substantial overall improvements over baselines, while also generalizing effectively to unseen graphs.
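The step-wise exploration described above can be sketched as a simple loop: at each step a policy inspects the current node's outgoing edges, picks one to follow, and the traversed path accumulates until an answer node is reached. The sketch below is purely illustrative; all names (`explore`, `greedy_policy`, the toy graph) are hypothetical stand-ins, and in RoE the `policy` would be an LLM selecting among candidate nodes and edges rather than a hand-written rule.

```python
# Illustrative sketch of a step-wise graph-exploration reasoning loop.
# Hypothetical names throughout; in the actual method the policy is an LLM.

def explore(graph, start, is_answer, policy, max_steps=10):
    """Walk the graph one edge at a time, letting `policy` choose the next
    neighbor; return the reasoning path and the answer node (or None)."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if is_answer(current):
            return path, current
        candidates = graph.get(current, [])  # outgoing (relation, node) edges
        if not candidates:
            break
        relation, nxt = policy(current, candidates, path)
        path.append((relation, nxt))
        current = nxt
    return path, None

# Toy knowledge graph: node -> [(relation, neighbor), ...]
toy_graph = {
    "Paris": [("capital_of", "France"), ("located_in", "Europe")],
    "France": [("currency", "Euro")],
}

# Mock policy: greedily follow the first unvisited neighbor.
def greedy_policy(node, candidates, path):
    visited = {node} | {n for _, n in path[1:]}
    for rel, nxt in candidates:
        if nxt not in visited:
            return rel, nxt
    return candidates[0]

path, answer = explore(
    toy_graph, "Paris",
    is_answer=lambda n: n == "Euro",
    policy=greedy_policy,
)
print(path, answer)
# The path records each (relation, node) hop taken during exploration.
```

Here the retrieved structure and the answer emerge from the same traversal, mirroring how RoE folds retrieval into generation instead of running them as two separate phases.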