Large Language Models (LLMs) have emerged as powerful tools for passage reranking in information retrieval, leveraging their superior reasoning capabilities to address the limitations of conventional models on complex queries. However, current LLM-based reranking paradigms are fundamentally constrained by an efficiency-accuracy trade-off: (1) pointwise methods are efficient but ignore inter-document comparison, yielding suboptimal accuracy; (2) listwise methods capture global context but suffer from context-window constraints and prohibitive inference latency. To address these issues, we propose GroupRank, a novel groupwise reranking paradigm that balances flexibility and context awareness. To unlock its full potential, we introduce an answer-free data synthesis pipeline that fuses local pointwise signals with global listwise rankings. The synthesized samples support both supervised fine-tuning and reinforcement learning, with the latter guided by a specialized group-ranking reward comprising a ranking-utility term and a group-alignment term. These complementary terms jointly optimize document ordering and score calibration so that scores reflect intrinsic query-document relevance. Experimental results show that GroupRank achieves a state-of-the-art 65.2 NDCG@10 on BRIGHT and surpasses baselines by 2.1 points on R2MED, while delivering a 6.4$\times$ inference speedup.
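For readers unfamiliar with the NDCG@10 metric reported above, the following is a minimal sketch of how it is commonly computed. It assumes the linear-gain form with a log2 rank discount, as used by trec_eval-style tooling; the abstract does not state which variant or evaluation toolkit was used. The function names `dcg_at_k` and `ndcg_at_k` are illustrative and not taken from the paper.

```python
import math
from typing import Optional, Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    # enumerate is 0-based, so the item at 1-based rank r gets discount log2(r + 1).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked_relevances: Sequence[float],
    k: int = 10,
    all_relevances: Optional[Sequence[float]] = None,
) -> float:
    """NDCG@k for one query.

    ranked_relevances: relevance labels in the order the reranker returned them.
    all_relevances: labels of every judged document for the query, if some
        relevant documents may be missing from the ranked list (assumption:
        when omitted, the ranked list is treated as the full judged pool).
    """
    pool = ranked_relevances if all_relevances is None else all_relevances
    ideal_dcg = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents: define the score as 0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


if __name__ == "__main__":
    # Toy example: one relevant document ranked second out of three.
    print(round(ndcg_at_k([0, 1, 0]), 4))
```

This function returns values in [0, 1]; a figure such as 65.2 is presumably the per-query score averaged over the benchmark and multiplied by 100.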