Document ranking pipelines in search systems are commonly built as cascades, in which multiple ranking layers integrate different kinds of information step by step. In this paper, we propose a novel re-ranker, Fusion-in-T5 (FiT5), which integrates text matching information, ranking features, and global document information into a single unified model via template-based input and global attention. Experiments on the passage ranking benchmarks MS MARCO and TREC DL show that FiT5, as a single model, significantly improves ranking performance over complex cascade pipelines. Analysis finds that, through attention fusion, FiT5 jointly utilizes various forms of ranking information by gradually attending to related documents and ranking features, which improves its detection of subtle nuances. Our code is open-sourced at https://github.com/OpenMatch/FiT5.
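To make the idea of template-based input concrete, the following is a minimal sketch of how a query, a candidate document, and a ranking feature could be serialized into one text sequence for a sequence-to-sequence re-ranker. The field names (`Query`, `Title`, `Feature`, `Text`, `Relevant`), the helper functions, and the binning of a first-stage retrieval score into an integer are illustrative assumptions; the abstract does not specify them, and the released code should be treated as the reference.

```python
def discretize_score(score: float, lo: float, hi: float, n_bins: int = 1000) -> int:
    """Map a raw retrieval score to an integer bin in [0, n_bins].

    Hypothetical helper: min-max normalization followed by truncation is an
    assumption made for illustration, not a documented detail of FiT5.
    """
    if hi <= lo:
        # Degenerate range: every score falls into the lowest bin.
        return 0
    clipped = min(max(score, lo), hi)
    frac = (clipped - lo) / (hi - lo)
    return int(frac * n_bins)


def build_template_input(query: str, title: str, text: str, feature_bin: int) -> str:
    """Fill a fixed template so that text and ranking feature share one input.

    The template wording is an assumption; the model would be trained to
    generate a relevance token after the final "Relevant:" prompt.
    """
    return (
        f"Query: {query} Title: {title} "
        f"Feature: {feature_bin} Text: {text} Relevant:"
    )


if __name__ == "__main__":
    bin_id = discretize_score(0.5, lo=0.0, hi=1.0, n_bins=10)
    print(build_template_input("what is fit5", "FiT5", "A re-ranker.", bin_id))
```

Serializing the feature as a token in the input lets a single encoder attend over text and feature together, rather than combining them in a separate ranking layer.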