Large Language Models (LLMs) have emerged as powerful tools for passage reranking in information retrieval, where their strong reasoning capabilities address the limitations of conventional models on complex queries. However, current LLM-based reranking paradigms are constrained by an efficiency-accuracy trade-off: (1) pointwise methods are efficient but ignore inter-document comparison, yielding suboptimal accuracy; (2) listwise methods capture global context but suffer from context-window constraints and prohibitive inference latency. To address these issues, we propose GroupRank, a novel groupwise reranking paradigm that balances flexibility and context awareness. To unlock the full potential of groupwise reranking, we further propose an answer-free data synthesis pipeline that fuses local pointwise signals with global listwise rankings to generate training samples. These samples support supervised fine-tuning and reinforcement learning, with the latter guided by a specialized group-ranking reward comprising two components, ranking utility and group alignment. These complementary components jointly optimize document ordering and score calibration so that scores reflect intrinsic query-document relevance. Experimental results show that GroupRank achieves a state-of-the-art 65.2 NDCG@10 on BRIGHT and surpasses baselines by 2.1 points on R2MED, while delivering a 6.4$\times$ inference speedup.
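To make the contrast with pointwise and listwise reranking concrete, the following is a minimal sketch of what a groupwise reranking loop could look like. It is an illustration under stated assumptions, not the GroupRank implementation: the function names, the `group_size` parameter, and the toy `overlap_scorer` (a stand-in for an LLM call that scores several documents jointly) are all hypothetical. The sketch also assumes that scores are comparable across groups, which is what score calibration would have to provide.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer interface: given a query and a group of documents,
# return one relevance score per document in the group.
GroupScorer = Callable[[str, Sequence[str]], List[float]]


def groupwise_rerank(
    query: str,
    docs: Sequence[str],
    score_group: GroupScorer,
    group_size: int = 4,
) -> List[Tuple[str, float]]:
    """Score candidates in fixed-size groups, then merge by score."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    scored: List[Tuple[str, float]] = []
    for start in range(0, len(docs), group_size):
        group = docs[start:start + group_size]
        # One scorer call sees the whole group, so it can compare documents.
        scores = score_group(query, group)
        if len(scores) != len(group):
            raise ValueError("scorer must return one score per document")
        scored.extend(zip(group, scores))
    # Assumption: scores are calibrated, so a global sort across groups is valid.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def overlap_scorer(query: str, group: Sequence[str]) -> List[float]:
    """Toy stand-in for an LLM: fraction of query terms found in each document."""
    terms = set(query.lower().split())
    return [
        len(terms & set(doc.lower().split())) / max(len(terms), 1)
        for doc in group
    ]


candidates = ["cats purr", "dogs bark loudly", "why dogs bark", "fish swim"]
ranked = groupwise_rerank("why dogs bark", candidates, overlap_scorer, group_size=2)
```

With `group_size=1` the loop degenerates to pointwise scoring, and with `group_size=len(docs)` it becomes a single listwise call, which is one way to read the flexibility the abstract refers to.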
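For readers unfamiliar with the reported metric, here is a short sketch of how NDCG@10 can be computed for a single query. It uses the linear-gain form of DCG; some evaluation toolkits use an exponential gain instead, and the abstract does not say which variant was used. As a simplification, the ideal ranking is derived from the same list of graded labels, and the function names are illustrative. The reported 65.2 is presumably this quantity averaged over queries and scaled by 100.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """NDCG@k for one query.

    `gains` holds the graded relevance of each document in ranked order.
    The ideal ordering is taken from the same labels (a simplification).
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# A ranking that places the only relevant document second instead of first.
score = ndcg_at_k([0, 1], k=10)
```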