Triple-based Iterative Retrieval-Augmented Generation (iRAG) mitigates document-level noise in multi-hop question answering. However, existing methods still face two limitations: (i) greedy single-path expansion, which propagates early errors and fails to capture parallel evidence from different reasoning branches, and (ii) granularity-demand mismatch, where a single evidence representation struggles to balance noise control with contextual sufficiency. In this paper, we propose the Construction-Integration Retrieval and Adaptive Generation model, CIRAG. It introduces an Iterative Construction-Integration module that constructs candidate triples and integrates them conditioned on the reasoning history, distilling core triples and generating the next-hop query; by preserving multiple plausible evidence chains, this module avoids the greedy single-path trap. In addition, we propose an Adaptive Cascaded Multi-Granularity Generation module that progressively expands contextual evidence according to the demands of each question, from triples to supporting sentences to full passages. Finally, we introduce Trajectory Distillation, which distills the teacher model's integration policy into a lightweight student model, enabling efficient and reliable long-horizon reasoning. Extensive experiments demonstrate that CIRAG consistently outperforms existing iRAG methods.
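To make the described workflow concrete, the following is a minimal Python sketch of a CIRAG-style loop. It is illustrative only: every name in it (State, retrieve, construct_candidate_triples, integrate, generate, cirag) is a hypothetical placeholder rather than the paper's actual interface, and retrieval, triple construction, and sufficiency checks are stubbed out to show just the control flow, i.e., iterating construction and integration to accumulate core triples and next-hop queries, then escalating evidence granularity only when a cheaper level proves insufficient.

```python
# Minimal, illustrative sketch of a CIRAG-style loop (assumed structure, not the
# authors' implementation). Every name below -- State, retrieve,
# construct_candidate_triples, integrate, generate, cirag -- is a hypothetical
# placeholder; retrieval, triple construction, and sufficiency checks are stubbed
# out to show only the control flow described in the abstract.

from dataclasses import dataclass, field


@dataclass
class State:
    question: str
    core_triples: list = field(default_factory=list)  # distilled evidence so far
    history: list = field(default_factory=list)       # past next-hop queries


def retrieve(query):
    """Stub retriever: return passages relevant to the query."""
    return [f"passage about {query}"]


def construct_candidate_triples(passages):
    """Stub construction step: extract (head, relation, tail) candidates."""
    return [("entity", "related_to", passage) for passage in passages]


def integrate(candidates, state):
    """Stub integration step: keep several plausible triples (not just one
    greedy branch) conditioned on the history, and emit the next-hop query
    (None once the chain is judged complete)."""
    kept = candidates[:2]
    done = len(state.history) >= 2
    next_query = None if done else f"follow-up on {kept[0][2]}"
    return kept, next_query


def generate(state, granularity):
    """Stub generator at one evidence granularity; returns None to signal that
    richer context (the next cascade level) is needed."""
    if granularity == "triples" and len(state.core_triples) < 4:
        return None
    return f"answer produced from {granularity}-level evidence"


def cirag(question, max_hops=3):
    state = State(question)
    query = question
    for _ in range(max_hops):                  # iterative construction-integration
        candidates = construct_candidate_triples(retrieve(query))
        kept, query = integrate(candidates, state)
        state.core_triples.extend(kept)
        state.history.append(query)
        if query is None:
            break
    # adaptive cascaded generation: escalate granularity only when necessary
    for granularity in ("triples", "sentences", "passages"):
        answer = generate(state, granularity)
        if answer is not None:
            return answer


if __name__ == "__main__":
    print(cirag("Who directed the film that won Best Picture in 1998?"))
```

In a real system, the integration step would score candidate triples with the (distilled) student model rather than truncating a list, and the generator would check answerability at each granularity before escalating; the sketch only mirrors the loop structure.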