We study ranked list truncation (RLT) from a novel "retrieve-then-re-rank" perspective, where we optimize re-ranking by truncating the retrieved list (i.e., trimming re-ranking candidates). RLT is crucial for re-ranking as it can improve re-ranking efficiency by sending variable-length candidate lists to a re-ranker on a per-query basis. It also has the potential to improve re-ranking effectiveness. Despite its importance, there is limited research into applying RLT methods in this new setup. To address this research gap, we reproduce existing RLT methods in the context of re-ranking, especially the recently emerged large language model (LLM)-based re-ranking. In particular, we examine to what extent established findings on RLT for retrieval generalize to the "retrieve-then-re-rank" setup from three perspectives: (i) assessing RLT methods in the context of LLM-based re-ranking with lexical first-stage retrieval, (ii) investigating the impact of different types of first-stage retrievers on RLT methods, and (iii) investigating the impact of different types of re-rankers on RLT methods. We perform experiments on the TREC 2019 and 2020 deep learning tracks, investigating 8 RLT methods in pipelines involving 3 retrievers and 2 re-rankers. We arrive at new insights into RLT methods in the context of re-ranking.
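To make the setup concrete, below is a minimal sketch of the "retrieve-then-re-rank" pipeline with RLT: a first-stage retriever produces a candidate pool, an RLT model predicts a per-query cutoff k, and only the top-k candidates are passed to the re-ranker. The names `retrieve`, `predict_cutoff`, and `rerank` are hypothetical placeholders, not the paper's implementation.

```python
# Sketch of "retrieve-then-re-rank" with ranked list truncation (RLT).
# All callables here are hypothetical placeholders, not the paper's code.

from typing import Callable, List, Tuple

Doc = Tuple[str, float]  # (doc_id, first-stage retrieval score)

def truncated_rerank(
    query: str,
    retrieve: Callable[[str, int], List[Doc]],        # e.g., BM25 over an index
    predict_cutoff: Callable[[str, List[Doc]], int],  # RLT model: per-query k
    rerank: Callable[[str, List[Doc]], List[Doc]],    # e.g., an LLM-based re-ranker
    pool_size: int = 1000,
) -> List[Doc]:
    """Retrieve a fixed-size candidate pool, truncate it at a
    per-query cutoff k, and re-rank only the surviving candidates."""
    candidates = retrieve(query, pool_size)  # first-stage ranked list
    k = predict_cutoff(query, candidates)    # variable-length cutoff per query
    return rerank(query, candidates[:k])     # re-ranker sees only the top-k
```

Because k varies per query, the re-ranker processes shorter lists for easy queries (saving cost) and longer lists for hard ones, which is the efficiency/effectiveness trade-off the abstract describes.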