We introduce Rank1, the first reranking model trained to take advantage of test-time compute. Rank1 demonstrates the applicability within retrieval of using a reasoning language model (i.e., OpenAI's o1, DeepSeek's R1, etc.) for distillation in order to rapidly improve the performance of a smaller model. We gather and open-source a dataset of more than 600,000 examples of R1 reasoning traces from queries and passages in MS MARCO. Models trained on this dataset: (1) achieve state-of-the-art performance on advanced reasoning and instruction-following datasets; (2) generalize remarkably well out of distribution due to their ability to respond to user-input prompts; and (3) produce explainable reasoning chains that can be given to users or RAG-based systems. Further, we demonstrate that quantized versions of these models retain strong performance while using less compute and memory. Overall, Rank1 shows that test-time compute enables a fundamentally new type of explainable and performant reranker model for search.
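To make the pointwise rerank-with-reasoning pattern concrete, the following is a minimal sketch. The model call is mocked with a toy keyword heuristic; in a real system it would be a generation call to the distilled reasoning reranker, which emits a reasoning trace followed by a final true/false relevance judgment. All function names here are hypothetical illustrations, not the paper's actual API.

```python
def mock_reasoning_model(query: str, passage: str) -> str:
    """Stand-in for a distilled reasoning reranker (hypothetical): emits a
    reasoning trace followed by a final true/false relevance judgment."""
    relevant = any(tok in passage.lower() for tok in query.lower().split())
    trace = ("<think>The passage "
             + ("mentions" if relevant else "does not mention")
             + " terms from the query.</think>")
    return trace + " " + ("true" if relevant else "false")

def rerank(query: str, passages: list[str]) -> list[str]:
    """Score each passage pointwise and order judged-relevant passages first.
    The reasoning trace could also be surfaced to users or a RAG pipeline."""
    scored = []
    for p in passages:
        trace = mock_reasoning_model(query, p)
        judgment = trace.rsplit(maxsplit=1)[-1] == "true"
        scored.append((judgment, p))
    # Relevant first; stable sort preserves original order within ties.
    return [p for _, p in sorted(scored, key=lambda t: not t[0])]

docs = ["Cats are mammals.", "Paris is the capital of France."]
print(rerank("capital France", docs))
```

The final judgment token doubles as the relevance score, while the preceding trace provides the explainability the abstract describes.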