Agentic knowledge graph question answering (KGQA) requires an agent to interact iteratively with knowledge graphs (KGs), which poses two challenges: scarce training data and limited reasoning generalization. Existing approaches often restrict agent exploration: prompting-based methods lack training for autonomous navigation, while current training pipelines usually confine reasoning to predefined trajectories. To address this, this paper proposes \textit{GraphWalker}, an agentic KGQA framework built on \textit{Automated Trajectory Synthesis} and \textit{Stage-wise Fine-tuning}. GraphWalker adopts a two-stage supervised fine-tuning (SFT) paradigm. First, the agent is trained on structurally diverse trajectories synthesized from constrained random-walk paths, which establishes a broad exploration prior over the KG. Second, the agent is further fine-tuned on a small set of expert trajectories to develop reflection and error-recovery capabilities. Extensive experiments show that this stage-wise SFT paradigm raises the performance ceiling of a lightweight reinforcement learning (RL) stage, enabling GraphWalker to achieve state-of-the-art performance on CWQ and WebQSP. Additional results on GrailQA and our constructed GraphWalkerBench confirm that GraphWalker improves generalization to out-of-distribution reasoning paths. The code is publicly available at https://github.com/XuShuwenn/GraphWalker
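To make the phrase "constrained random-walk paths" concrete, the following is a minimal sketch of how such paths might be sampled from a KG. It is not GraphWalker's actual synthesis procedure: the toy graph `TOY_KG`, the function name `constrained_random_walk`, and the two constraints used here (a hop limit and no revisited entities) are all assumptions chosen for illustration.

```python
import random

# Hypothetical toy KG: head entity -> list of (relation, tail entity).
TOY_KG = {
    "Paris": [("capital_of", "France"), ("located_on", "Seine")],
    "France": [("member_of", "EU"), ("currency", "Euro")],
    "EU": [("currency", "Euro")],
    "Seine": [],
    "Euro": [],
}


def constrained_random_walk(kg, start, max_hops, rng):
    """Sample one path of (head, relation, tail) triples starting at `start`.

    Assumed constraints (illustrative only): the path has at most
    `max_hops` edges, and no entity is visited twice.
    """
    path = []
    visited = {start}
    current = start
    for _ in range(max_hops):
        # Keep only outgoing edges that lead to an unvisited entity.
        candidates = [(r, t) for r, t in kg.get(current, []) if t not in visited]
        if not candidates:
            break  # dead end: stop early with a shorter path
        relation, tail = rng.choice(candidates)
        path.append((current, relation, tail))
        visited.add(tail)
        current = tail
    return path


# A seeded generator keeps the sampled paths reproducible.
rng = random.Random(0)
paths = [constrained_random_walk(TOY_KG, "Paris", 3, rng) for _ in range(5)]
```

Each sampled path is a connected chain of triples. In a pipeline of the kind the abstract describes, such paths would then be turned into question-and-trajectory training examples, a step this sketch does not attempt.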