LLM-based listwise passage reranking has attracted attention for its effectiveness in ranking candidate passages. However, these models suffer from positional bias: passages placed toward the end of the input are less likely to be moved to top positions in the ranking. We hypothesize two primary sources of positional bias: (1) architectural bias inherent in LLMs and (2) the imbalanced positioning of relevant documents. To address this, we propose DebiasFirst, a method that integrates positional calibration and position-aware data augmentation during fine-tuning. Positional calibration uses inverse propensity scoring to correct for positional bias by re-weighting the contributions of different positions in the loss function during training. Position-aware data augmentation expands the training data so that each passage appears uniformly across positions in the input list. This approach markedly improves both effectiveness and robustness to the initial ordering across diverse first-stage retrievers, reducing the dependence of NDCG@10 performance on the position of relevant documents. DebiasFirst also complements inference-stage debiasing methods, offering a practical solution for mitigating positional bias in reranking.
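To make the two training-time components concrete, the following is a minimal Python sketch, assuming a listwise loss that decomposes into per-position terms and empirically estimated promotion propensities. All names (cyclic_augment, ips_weights, weighted_listwise_loss) and the example values are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch (not DebiasFirst's actual code) of the two
# training-time ideas described above: position-aware augmentation
# and IPS-based positional calibration of the loss.

def cyclic_augment(passages):
    """Position-aware augmentation: build cyclic shifts of the candidate
    list so every passage appears at every input position exactly once
    across the augmented views."""
    n = len(passages)
    return [passages[i:] + passages[:i] for i in range(n)]

def ips_weights(propensities, floor=1e-6):
    """Inverse propensity scores: a position where relevant passages are
    rarely promoted to the top (low propensity) gets a larger weight."""
    return [1.0 / max(p, floor) for p in propensities]

def weighted_listwise_loss(per_position_losses, propensities):
    """Positional calibration: re-weight each position's contribution to
    the listwise training loss by its inverse propensity."""
    weights = ips_weights(propensities)
    total = sum(w * loss for w, loss in zip(weights, per_position_losses))
    return total / sum(weights)  # normalize to keep the loss scale stable

# Hypothetical usage: propensities could be estimated empirically, e.g.,
# as the observed rate at which a known-relevant passage placed at each
# input position is ranked into the top-k by the unadjusted model.
views = cyclic_augment(["p1", "p2", "p3", "p4"])
loss = weighted_listwise_loss([0.3, 0.5, 0.9, 1.2], [0.9, 0.6, 0.3, 0.1])
```

Under this weighting, the loss terms from late input positions (low propensity) dominate the gradient, pushing the model to learn to promote relevant passages regardless of where they appear in the input.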