Dynamic Ranked List Truncation for Reranking Pipelines via LLM-generated Reference-Documents

Large language models (LLMs) are widely used for reranking, but computational overhead and long context lengths remain challenging for LLM rerankers. Efficient reranking usually involves selecting a subset of the first-stage ranked list, a step known as ranked list truncation (RLT), and passing the truncated list to a reranker. For LLM rerankers, the ranked list is also often partitioned and processed sequentially in batches to reduce the context length. Both steps rely on hyperparameters and topic-agnostic heuristics. LLMs have recently been shown to be effective for relevance judgment. Building on this, we propose that LLMs can generate reference documents that act as a pivot between relevant and non-relevant documents in a ranked list. We propose methods that use these generated reference documents for RLT as well as for efficient listwise reranking. During reranking, we process the ranked list either in parallel batches of non-overlapping windows or in overlapping windows with adaptive strides, improving on the existing fixed-stride setup. The generated reference documents are also shown to improve existing efficient listwise reranking frameworks. Experiments on TREC Deep Learning benchmarks show that our approach outperforms existing RLT-based approaches. On in-domain and out-of-domain benchmarks, our proposed methods accelerate LLM-based listwise reranking by up to 66% compared to existing approaches. This work not only establishes a practical paradigm for efficient LLM-based reranking but also provides insight into the capability of LLMs to generate semantically controlled documents using relevance signals.
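
To make the pivot idea concrete, here is a minimal Python sketch of truncating a ranked list at a reference document. The abstract does not say how ranked documents are compared with the generated reference document, so everything here is an assumption for illustration: the function name `truncate_at_reference`, the per-document relevance scores, and the rule of keeping the list down to the deepest document that scores at least as high as the reference. The scores are assumed to come from a hypothetical scorer separate from the first-stage ranker, so they need not decrease with rank.

```python
from typing import List, Tuple

def truncate_at_reference(
    ranked: List[Tuple[str, float]], reference_score: float
) -> List[Tuple[str, float]]:
    """Keep the prefix of `ranked` ending at the deepest document whose
    score is at least `reference_score` (hypothetical truncation rule)."""
    cutoff = 0
    for position, (_, score) in enumerate(ranked, start=1):
        if score >= reference_score:
            cutoff = position  # deepest position still at or above the pivot
    return ranked[:cutoff]

# Illustrative first-stage order with made-up scores from a second scorer.
ranked_list = [
    ("d1", 0.91), ("d2", 0.84), ("d3", 0.62),
    ("d4", 0.70), ("d5", 0.41), ("d6", 0.33),
]
# Made-up score of the LLM-generated reference document under the same scorer.
truncated = truncate_at_reference(ranked_list, reference_score=0.65)
print([doc_id for doc_id, _ in truncated])
```

Because the cutoff depends on where the reference document falls for each query, the truncation depth varies by topic instead of being a fixed hyperparameter, which is the property the abstract contrasts with topic-agnostic heuristics.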
