The classical retrieve--rerank cascade suffers from a bounded recall problem: the reranker can only surface documents that the first-stage retriever has already returned. Most current approaches address this problem by improving the first-stage retriever, but doing so incurs substantial training and inference costs, especially for queries that require substantial reasoning. To circumvent the computational costs of reasoning-based retrievers, we replicate the findings of Graph-based Adaptive Reranking (GAR) on the BRIGHT reasoning-intensive retrieval benchmark. GAR addresses the bounded recall problem by modifying the reranking process itself through iterative exploration of a corpus graph, but it was previously tested only with models designed for topical and question-answering-style queries. Hence, we reproduce GAR in reasoning-intensive settings with both reasoning and non-reasoning reranking models. We observe that the quality of the reranker's signal plays an important role in identifying additional relevant documents within the corpus graph. Overall, we find that GAR boosts the effectiveness of reasoning-intensive retrieval across a variety of models while adding minimal computational overhead. Ultimately, this work enables more practical deployment of retrieval systems that can address reasoning-intensive queries.
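To illustrate the iterative exploration of a corpus graph described above, the following is a minimal sketch and not the evaluated implementation. It assumes a precomputed neighbour map, a fixed scoring budget, and strict alternation between the first-stage pool and a graph frontier prioritised by the reranker score of the document that introduced each neighbour. The names `gar_rerank`, `neighbours`, `score`, `budget`, and `batch_size` are hypothetical and chosen for illustration only.

```python
import heapq
from collections import deque
from typing import Callable, Dict, List, Sequence, Tuple


def gar_rerank(
    initial: Sequence[str],
    neighbours: Dict[str, List[str]],
    score: Callable[[str], float],
    budget: int,
    batch_size: int,
) -> List[Tuple[str, float]]:
    """Sketch of graph-based adaptive reranking (all names are illustrative)."""
    scored: Dict[str, float] = {}
    pool = deque(initial)                   # first-stage results, in rank order
    frontier: List[Tuple[float, str]] = []  # min-heap of (-source score, doc id)

    def take_from_pool(n: int) -> List[str]:
        batch: List[str] = []
        while pool and len(batch) < n:
            doc = pool.popleft()
            if doc not in scored:
                batch.append(doc)
        return batch

    def take_from_frontier(n: int) -> List[str]:
        batch: List[str] = []
        while frontier and len(batch) < n:
            _, doc = heapq.heappop(frontier)
            if doc not in scored and doc not in batch:
                batch.append(doc)
        return batch

    sources = [take_from_pool, take_from_frontier]
    turn = 0
    while len(scored) < budget:
        n = min(batch_size, budget - len(scored))
        # Alternate between the two sources; fall back if one is exhausted.
        batch = sources[turn % 2](n) or sources[(turn + 1) % 2](n)
        if not batch:
            break  # nothing left to score
        for doc in batch:
            scored[doc] = score(doc)  # the (expensive) reranker call
        for doc in batch:
            # Neighbours inherit the priority of the document that found them.
            for nb in neighbours.get(doc, []):
                if nb not in scored:
                    heapq.heappush(frontier, (-scored[doc], nb))
        turn += 1
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: "z" is relevant but absent from the first-stage results.
toy_scores = {"a": 0.9, "b": 0.1, "x": 0.8, "y": 0.2, "z": 0.95}
toy_graph = {"a": ["x"], "b": ["y"], "x": ["z"]}
ranking = gar_rerank(["a", "b"], toy_graph, toy_scores.__getitem__,
                     budget=4, batch_size=1)
```

In this toy run the document `z`, which the first stage never returned, is reached through the neighbours of highly scored documents and ends up at the top of the final ranking. The number of reranker calls stays capped by `budget`, which is why the only extra cost in this sketch is the graph lookup. Because the frontier is ordered by reranker scores, a weak reranker signal would steer exploration toward less useful neighbours.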