Breaking the Static Graph: Context-Aware Traversal for Robust Retrieval-Augmented Generation

Recent advances in Retrieval-Augmented Generation (RAG) have shifted from simple vector similarity to structure-aware approaches such as HippoRAG, which leverage Knowledge Graphs (KGs) and Personalized PageRank (PPR) to capture multi-hop dependencies. However, these methods suffer from a "Static Graph Fallacy": they rely on fixed transition probabilities determined during indexing. This rigidity ignores the query-dependent nature of edge relevance and causes semantic drift, in which random walks are diverted into high-degree "hub" nodes before reaching critical downstream evidence. Consequently, models often achieve high partial recall but fail to retrieve the complete evidence chain required for multi-hop queries. To address this, we propose CatRAG (Context-Aware Traversal for robust RAG), a framework that builds on the HippoRAG 2 architecture and transforms the static KG into a query-adaptive navigation structure. CatRAG steers the random walk with three components: (1) Symbolic Anchoring, which injects weak entity constraints to regularize the random walk; (2) Query-Aware Dynamic Edge Weighting, which modulates the graph structure to prune irrelevant paths while amplifying those aligned with the query's intent; and (3) Key-Fact Passage Weight Enhancement, a cost-efficient bias that structurally anchors the random walk to likely evidence. Experiments across four multi-hop benchmarks demonstrate that CatRAG consistently outperforms state-of-the-art baselines. Our analysis reveals that while standard recall metrics show modest gains, CatRAG achieves substantial improvements in reasoning completeness, that is, the capacity to recover the entire evidence path without gaps. These results indicate that our approach bridges the gap between retrieving partial context and enabling fully grounded reasoning. Resources are available at https://github.com/kwunhang/CatRAG.
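
To make the contrast between static and query-adaptive traversal concrete, the following is a minimal sketch of Personalized PageRank on a toy graph, run once with the indexing-time edge weights and once after scaling each edge by a query-dependent relevance score. It is an illustration under stated assumptions, not the CatRAG implementation: the function names, the toy graph, the relevance scores, the `floor` value, and the damping factor are all hypothetical, and how relevance scores are actually obtained is not specified here.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, weight)]


def reweight_edges(graph: Graph, relevance: Dict[Edge, float],
                   floor: float = 0.05) -> Graph:
    """Scale each static edge weight by a query-dependent relevance score.

    Edges without a score fall back to `floor` (a hypothetical default),
    so they are down-weighted rather than deleted outright.
    """
    return {
        u: [(v, w * max(relevance.get((u, v), floor), floor)) for v, w in edges]
        for u, edges in graph.items()
    }


def personalized_pagerank(graph: Graph, seeds: Dict[str, float],
                          damping: float = 0.5, iters: int = 100) -> Dict[str, float]:
    """Power-iteration PPR; dangling nodes return their mass to the seeds."""
    nodes = set(graph) | {v for edges in graph.values() for v, _ in edges} | set(seeds)
    total = sum(seeds.values())
    reset = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(reset)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * reset[n] for n in nodes}
        for u in nodes:
            edges = graph.get(u, [])
            out_sum = sum(w for _, w in edges)
            if out_sum == 0.0:
                # Dangling node: redistribute its mass over the reset distribution.
                for n in nodes:
                    nxt[n] += damping * rank[u] * reset[n]
                continue
            for v, w in edges:
                # Transition probability is the edge's share of outgoing weight.
                nxt[v] += damping * rank[u] * w / out_sum
        rank = nxt
    return rank


# Toy graph (assumed): the query entity links strongly to a high-degree hub
# and weakly to the bridge node that leads to the needed evidence.
graph: Graph = {
    "query_entity": [("hub", 3.0), ("bridge", 1.0)],
    "hub": [("distractor_a", 1.0), ("distractor_b", 1.0), ("distractor_c", 1.0)],
    "bridge": [("evidence", 1.0)],
}
seeds = {"query_entity": 1.0}

# Hypothetical query-dependent relevance scores in [0, 1].
relevance = {
    ("query_entity", "bridge"): 0.9,
    ("query_entity", "hub"): 0.1,
    ("bridge", "evidence"): 0.9,
}

static_scores = personalized_pagerank(graph, seeds)
adaptive_scores = personalized_pagerank(reweight_edges(graph, relevance), seeds)

print("evidence (static):  ", round(static_scores["evidence"], 4))
print("evidence (adaptive):", round(adaptive_scores["evidence"], 4))
```

In this toy setting the static walk sends most of its mass from the query entity into the hub, while the reweighted walk favors the bridge and therefore assigns a higher score to the evidence node. Using a floor rather than removing low-relevance edges keeps the graph connected, which is one plausible design choice among several.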
