DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval

Retrieval-augmented generation has achieved strong performance on knowledge-intensive tasks where query-document relevance can be identified through direct lexical or semantic matches. However, many real-world queries involve abstract reasoning, analogical thinking, or multi-step inference, which existing retrievers often struggle to capture. To address this challenge, we present DIVER, a retrieval pipeline designed for reasoning-intensive information retrieval. It consists of four components. The document preprocessing stage enhances readability and preserves content by cleaning noisy texts and segmenting long documents. The query expansion stage leverages large language models to iteratively refine user queries with explicit reasoning and evidence from retrieved documents. The retrieval stage employs a model fine-tuned on synthetic data spanning medical and mathematical domains, along with hard negatives, enabling effective handling of reasoning-intensive queries. Finally, the reranking stage combines pointwise and listwise strategies to produce both fine-grained and globally consistent rankings. On the BRIGHT benchmark, DIVER achieves state-of-the-art nDCG@10 scores of 46.8 overall and 31.9 on original queries, consistently outperforming competitive reasoning-aware models. These results demonstrate the effectiveness of reasoning-aware retrieval strategies in complex real-world tasks.
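The headline numbers above are nDCG@10 scores. As a minimal sketch of how that metric is commonly computed for a single query, the snippet below uses the linear-gain form of DCG. The function names, the `relevance` dictionary layout, and the document IDs are hypothetical and are not taken from DIVER or the BRIGHT evaluation code. Scores such as 46.8 are presumably this value averaged over queries and multiplied by 100.

```python
import math
from typing import Dict, List


def dcg_at_k(gains: List[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k positions (linear gain)."""
    # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids: List[str],
              relevance: Dict[str, float],
              k: int = 10) -> float:
    """nDCG@k for one query.

    ranked_doc_ids: retriever output, best first (hypothetical IDs).
    relevance: graded relevance labels; documents not listed count as 0.
    """
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_doc_ids]
    # The ideal ranking places the highest-graded documents first.
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0  # No relevant documents are labeled for this query.
    return dcg_at_k(gains, k) / ideal_dcg
```

For example, `ndcg_at_k(["d1", "d2"], {"d1": 1.0})` evaluates to 1.0 because the only relevant document is ranked first, while swapping the two documents lowers the score because the relevant one is then discounted by log2(3).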

