Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking

Reranking algorithms have improved document retrieval quality by efficiently aggregating relevance judgments generated by large language models (LLMs). However, identifying relevant documents for queries that require in-depth reasoning remains a major challenge. Reasoning-intensive queries often exhibit multifaceted information needs and nuanced interpretations, rendering document relevance inherently context-dependent. To address this, we propose contextual relevance, which we define as the probability that a document is relevant to a given query, marginalized over the distribution of reranking contexts it may appear in (i.e., the set of candidate documents it is ranked alongside and the order in which the documents are presented to a reranking model). While prior work has studied methods to mitigate the positional bias LLMs exhibit by accounting for the ordering of documents, we empirically find that the composition of these batches also plays an important role in reranking performance. To efficiently estimate contextual relevance, we propose TS-SetRank, a sampling-based, uncertainty-aware reranking algorithm. Empirically, TS-SetRank improves nDCG@10 over retrieval and reranking baselines by 15-25% on BRIGHT and 6-21% on BEIR, highlighting the importance of modeling relevance as context-dependent.
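
The definition of contextual relevance above can be illustrated with a minimal sketch. It is a plain uniform Monte Carlo estimator, not TS-SetRank itself: the abstract does not describe the algorithm's sampling policy, so none is reproduced here. Every name in the block (`estimate_contextual_relevance`, `toy_judge`, `batch_size`, `num_rounds`) is hypothetical, and the toy judge stands in for an LLM reranker that returns a binary relevance label per document in a batch.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical judge interface (an assumption, not from the paper): given a
# query and an ordered batch of document ids, return one 0/1 label per document.
Judge = Callable[[str, Sequence[str]], List[int]]


def estimate_contextual_relevance(
    query: str,
    doc_ids: Sequence[str],
    judge: Judge,
    batch_size: int = 2,
    num_rounds: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Average each document's relevance label over randomly sampled contexts.

    A context here is a random subset of candidates in a random order, so the
    average approximates relevance marginalized over batch composition and order.
    """
    rng = random.Random(seed)
    hits = {d: 0 for d in doc_ids}
    counts = {d: 0 for d in doc_ids}
    for _ in range(num_rounds):
        # random.sample returns the chosen items in random order, so this draws
        # both the batch composition and the presentation order.
        batch = rng.sample(list(doc_ids), batch_size)
        labels = judge(query, batch)
        for doc, label in zip(batch, labels):
            hits[doc] += label
            counts[doc] += 1
    # Documents that were never sampled get 0.0 by convention in this sketch.
    return {d: (hits[d] / counts[d] if counts[d] else 0.0) for d in doc_ids}


def toy_judge(query: str, batch: Sequence[str]) -> List[int]:
    """Toy stand-in for an LLM: 'a' is always relevant, 'b' is judged relevant
    only when 'a' is absent from the batch, everything else is irrelevant."""
    labels = []
    for doc in batch:
        if doc == "a":
            labels.append(1)
        elif doc == "b":
            labels.append(0 if "a" in batch else 1)
        else:
            labels.append(0)
    return labels


scores = estimate_contextual_relevance("example query", ["a", "b", "c", "d"], toy_judge)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup, document `b` receives a score strictly between 0 and 1 because its label depends on which documents share its batch, which is the effect the abstract attributes to batch composition. An uncertainty-aware method would presumably spend its samples more selectively than this uniform scheme does.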

