Retrieval-augmented generation (RAG) depends critically on the quality and granularity of the retrieved evidence. Large retrieval units preserve context but often introduce irrelevant content, which can dilute answer-bearing evidence and degrade long-context utilization. Fine-grained units are more compact, but they can be difficult to retrieve reliably because short chunks may lack the semantic, lexical, or bridging cues needed to match the query. We propose Uncertainty-aware Multi-Granularity RAG (UMG-RAG), a training-free hybrid retrieval framework that treats the choice of chunk granularity as a query-specific reliability estimation problem. Instead of training a new retriever or modifying the generator, UMG-RAG uses existing dense and sparse retrievers as complementary experts across multiple chunk granularities. For each query, it converts the score list of each expert-granularity pair into an evidence distribution, estimates that list's reliability from the entropy of the distribution, and fuses candidates according to query-specific semantic, lexical, and granularity confidence. We further introduce UMGP-RAG, a parent-promotion variant that uses fine-grained hits to locate relevant evidence while returning broader, non-redundant parent chunks for local coherence. Experiments on question-answering benchmarks show that uncertainty-aware fusion and parent promotion improve generation quality while maintaining a lightweight, plug-and-play retrieval pipeline.
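To make the fusion and parent-promotion steps concrete, the following is a minimal sketch rather than the implementation evaluated in the paper. The abstract does not specify how scores become a distribution, how entropy maps to reliability, or how candidates are combined, so the softmax normalization, the "one minus normalized entropy" reliability score, the reliability-weighted sum, and all function and variable names here (`evidence_distribution`, `reliability`, `fuse`, `promote_to_parents`, `parent_of`) are illustrative assumptions. The sketch also collapses the semantic, lexical, and granularity confidences into a single weight per score list.

```python
import math
from collections import defaultdict


def evidence_distribution(scores, temperature=1.0):
    """Softmax over one expert-granularity score list (assumed normalization)."""
    if not scores:
        return {}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {cid: math.exp((s - top) / temperature) for cid, s in scores.items()}
    total = sum(exps.values())
    return {cid: e / total for cid, e in exps.items()}


def reliability(dist):
    """Assumed reliability: 1 - normalized Shannon entropy.

    Close to 1 for a sharply peaked list, close to 0 for a uniform one.
    """
    n = len(dist)
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    entropy = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return 1.0 - entropy / math.log(n)


def fuse(score_lists, top_k=5):
    """Reliability-weighted sum of evidence distributions.

    score_lists maps (expert, granularity) -> {chunk_id: raw score}.
    """
    fused = defaultdict(float)
    for scores in score_lists.values():
        dist = evidence_distribution(scores)
        weight = reliability(dist)
        for cid, p in dist.items():
            fused[cid] += weight * p
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


def promote_to_parents(ranked, parent_of, top_k=3):
    """Map ranked hits to parent chunks, keeping each parent only once."""
    seen, parents = set(), []
    for cid, _ in ranked:
        pid = parent_of.get(cid, cid)  # chunks without a parent stand for themselves
        if pid not in seen:
            seen.add(pid)
            parents.append(pid)
    return parents[:top_k]


# Hypothetical toy data: two experts at sentence level, one at passage level.
score_lists = {
    ("dense", "sentence"): {"s1": 0.9, "s2": 0.2, "s3": 0.1},
    ("sparse", "sentence"): {"s1": 5.0, "s2": 5.0, "s3": 5.0},  # uninformative list
    ("dense", "passage"): {"p1": 0.6, "p2": 0.5},
}
parent_of = {"s1": "p1", "s2": "p1", "s3": "p2"}

ranked = fuse(score_lists)
print(ranked)
print(promote_to_parents(ranked, parent_of))
```

In this toy setup the sparse list assigns the same score to every chunk, so its entropy is maximal and it contributes almost nothing to the fused ranking, while the more peaked dense sentence-level list dominates. Raw dense and sparse scores live on different scales, so a practical version would likely need per-expert temperature tuning or score standardization before the softmax.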