Supervised ranking methods based on bi-encoder or cross-encoder architectures have shown success in multi-stage text ranking tasks, but they require large amounts of relevance judgments as training data. In this work, we propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data. Unlike existing pointwise ranking methods, where documents are scored independently and ranked according to their scores, LRL directly generates a reordered list of document identifiers given the candidate documents. Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker that refines the top-ranked results of a pointwise method for improved efficiency. Additionally, we apply our approach to subsets of MIRACL, a recent multilingual retrieval dataset, and the results show its potential to generalize across different languages.
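The contrast between pointwise and listwise reranking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the `[i]` identifier format, the fallback for malformed output, and the names `build_listwise_prompt`, `parse_permutation`, `listwise_rerank`, `pointwise_rerank`, `generate`, and `score` are all assumptions made for illustration. Here `generate` stands in for any text-generation call and `score` for any per-document relevance scorer.

```python
import re
from typing import Callable, List, Sequence


def build_listwise_prompt(query: str, passages: Sequence[str]) -> str:
    """Number each candidate and ask for an ordering of the identifiers.

    The wording is hypothetical, not the prompt used in the paper.
    """
    numbered = [f"[{i}] {text}" for i, text in enumerate(passages, start=1)]
    return (
        f"Query: {query}\n"
        + "\n".join(numbered)
        + "\nRank the passages by relevance to the query, e.g. [2] > [1] > [3]:"
    )


def parse_permutation(output: str, n: int) -> List[int]:
    """Turn generated text into a complete 0-based permutation of range(n)."""
    order: List[int] = []
    for token in re.findall(r"\d+", output):
        idx = int(token) - 1
        # Skip identifiers that are out of range or already seen.
        if 0 <= idx < n and idx not in order:
            order.append(idx)
    # Assumed fallback: candidates the model omitted keep their original order.
    order.extend(i for i in range(n) if i not in order)
    return order


def listwise_rerank(
    query: str, passages: Sequence[str], generate: Callable[[str], str]
) -> List[str]:
    """Listwise: one generation call sees all candidates and returns an ordering."""
    prompt = build_listwise_prompt(query, passages)
    order = parse_permutation(generate(prompt), len(passages))
    return [passages[i] for i in order]


def pointwise_rerank(
    query: str, passages: Sequence[str], score: Callable[[str, str], float]
) -> List[str]:
    """Pointwise: each candidate is scored independently, then sorted by score."""
    return sorted(passages, key=lambda p: score(query, p), reverse=True)
```

In a multi-stage setup such as the one the abstract mentions, `pointwise_rerank` could be applied to the full candidate list and `listwise_rerank` only to the top few results, so that the more expensive generation call sees a short list.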