LTRR: Learning To Rank Retrievers for LLMs

Retrieval-Augmented Generation (RAG) systems typically rely on a single fixed retriever, despite growing evidence that no single retriever performs optimally across all query types. In this paper, we explore a query routing approach that dynamically selects from a pool of retrievers based on the query, using both training-free heuristics and learned routing models. We frame routing as a learning-to-rank problem and introduce LTRR, a framework that Learns To Rank Retrievers according to their expected contribution to downstream RAG performance. Through experiments on diverse question-answering benchmarks with controlled variations in query types, we demonstrate that routing-based RAG consistently surpasses the strongest single-retriever baselines. The gains are particularly substantial when training with the Answer Correctness (AC) objective and when using pairwise ranking methods, with XGBoost yielding the best results. Additionally, our approach exhibits stronger generalization to out-of-distribution queries. Overall, our results underscore the critical role of both training strategy and optimization metric choice in effective query routing for RAG systems.
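
To make the pairwise learning-to-rank framing concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy setup that does not come from the paper: two hypothetical retrievers (`bm25`, `dense`), two hypothetical query types, a one-hot feature per (query type, retriever) pair, and made-up utility values standing in for a downstream metric such as Answer Correctness. A plain linear scorer trained with a pairwise logistic loss is swapped in for XGBoost so that the example needs only the standard library.

```python
import math
import random

# Hypothetical retriever pool; names are illustrative only.
RETRIEVERS = ["bm25", "dense"]


def features(query_type, retriever):
    """One-hot feature for a (query type, retriever) pair (toy assumption)."""
    vec = [0.0] * 4
    vec[query_type * 2 + RETRIEVERS.index(retriever)] = 1.0
    return vec


def score(w, x):
    """Linear score of one retriever for one query."""
    return sum(wk * xk for wk, xk in zip(w, x))


def pairwise_train(queries, n_features, epochs=200, lr=0.1, seed=0):
    """Fit a linear scorer with a pairwise logistic loss.

    queries: list of queries; each query is a list of
    (feature_vector, utility) tuples, one per retriever.
    """
    rng = random.Random(seed)
    # Build preference pairs (better, worse) within each query.
    pairs = []
    for retrievers in queries:
        for fi, ui in retrievers:
            for fj, uj in retrievers:
                if ui > uj:
                    pairs.append((fi, fj))
    w = [0.0] * n_features
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = [a - b for a, b in zip(better, worse)]
            margin = score(w, diff)
            # Gradient step on log(1 + exp(-margin)).
            g = 1.0 / (1.0 + math.exp(margin))
            w = [wk + lr * g * dk for wk, dk in zip(w, diff)]
    return w


def route(w, query_type):
    """Pick the retriever with the highest predicted score for this query."""
    return max(RETRIEVERS, key=lambda r: score(w, features(query_type, r)))


# Made-up utilities: bm25 helps query type 0 more, dense helps type 1 more.
train = [
    [(features(0, "bm25"), 0.9), (features(0, "dense"), 0.4)],
    [(features(1, "bm25"), 0.3), (features(1, "dense"), 0.8)],
]
w = pairwise_train(train, n_features=4)
print(route(w, 0), route(w, 1))
```

In this sketch the router only learns which retriever should be preferred over another for a given query, rather than predicting the utility value itself, which is the essence of the pairwise formulation. A real router would use richer query and retriever features and a stronger ranking model.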

