Graph-based retrieval-augmented generation (GraphRAG) is effective for knowledge-intensive and multi-hop query tasks. However, many existing methods primarily seed retrieval on entity-based graphs and rely on implicit propagation of semantic relevance. As a result, they often (i) under-retrieve when user queries are abstract and semantically sparse at the entity level, and (ii) suffer from brittle multi-hop reasoning, in which noisy activations derail entity-to-entity transitions and corrupt the inferred relation chain, yielding unreliable conclusions. To address these issues, we propose \texttt{FlowRAG}, a semantic-aware retrieval framework that improves both semantic recall and explicit reasoning. Specifically, \texttt{FlowRAG} constructs a quad-level heterogeneous graph over passages, summaries, sentences, and entities, in which summary nodes serve as coarse semantic hubs. At retrieval time, a dual-granularity activation module combines summary--query alignment with sentence-level matching to robustly activate relevant entities under paraphrase and abstraction. We then introduce a frequency-aware weighted flow module that routes relevance through entity--passage links weighted by within-passage term frequency, prunes noisy connections, and extracts high-confidence reasoning paths as an explicit logic skeleton for generation. Extensive experiments show that \texttt{FlowRAG} achieves state-of-the-art performance on complex reasoning benchmarks.
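To make the frequency-aware weighted flow idea concrete, the following is a minimal sketch rather than the \texttt{FlowRAG} implementation. It assumes whitespace tokenization, single-token entities, length-normalized term frequency as the link weight, and a fixed pruning threshold. The function names `build_entity_passage_weights` and `flow_relevance` and the parameter `min_weight` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_entity_passage_weights(passages, entities):
    """Weight each entity-passage link by within-passage term frequency.

    Assumption: frequency is normalized by passage length; the abstract
    does not specify the exact weighting scheme.
    """
    weights = defaultdict(dict)  # entity -> {passage_id: weight}
    for pid, text in passages.items():
        tokens = text.lower().split()
        counts = Counter(tokens)
        for ent in entities:
            tf = counts[ent.lower()]
            if tf > 0:
                weights[ent][pid] = tf / len(tokens)
    return weights


def flow_relevance(seed_scores, weights, min_weight=0.1):
    """Route activated-entity scores to passages over the weighted links.

    Links lighter than min_weight (a hypothetical threshold) are pruned
    as noise; each entity's score is split across its surviving links in
    proportion to their weights.
    """
    passage_scores = defaultdict(float)
    for ent, score in seed_scores.items():
        links = {p: w for p, w in weights.get(ent, {}).items() if w >= min_weight}
        norm = sum(links.values())
        for pid, w in links.items():
            passage_scores[pid] += score * w / norm
    # Highest-scoring passages first.
    return sorted(passage_scores.items(), key=lambda kv: kv[1], reverse=True)


passages = {
    "p1": "curie discovered radium and curie studied radium",
    "p2": "radium is a chemical element found in ore deposits across several regions",
}
weights = build_entity_passage_weights(passages, ["curie", "radium"])
ranked = flow_relevance({"curie": 1.0, "radium": 0.5}, weights)
print(ranked)
```

In this toy input, "radium" appears once in the twelve-token passage `p2`, so that link falls below the threshold and is pruned, and all relevance flows to `p1`. Setting `min_weight=0.0` keeps the weak link and splits the "radium" score between the two passages in proportion to their frequencies.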