Retrieval-augmented generation (RAG) systems typically rely on a single retriever with a single set of hyperparameters, even though they face highly heterogeneous queries that range from simple factoid questions to complex multi-hop reasoning. We propose a method that automatically selects a small, diverse subset of retrievers (a portfolio) from a large pool of candidates so that its members cover different regions of the target query distribution. We formalize this setting via an expected best-of-$k$ objective over the query distribution and show that it admits an efficient portfolio construction algorithm with near-optimal guarantees. Across multiple QA benchmarks, our learned portfolios and router pipeline consistently outperform single-retriever and naive multi-retriever baselines on both retrieval metrics and answer quality. In addition, compared to inference-time hyperparameter tuning approaches, fixed portfolios enable parallel retrieval and LLM calls, achieving comparable (and sometimes better) accuracy with substantially lower latency and token cost.
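To make the expected best-of-$k$ objective concrete, the following is a minimal sketch, not the construction algorithm referred to above, whose details this abstract does not give. It assumes a precomputed matrix of non-negative per-query scores (rows are sampled queries, columns are candidate retrievers) and uses plain greedy selection, a standard technique for objectives of this max-over-a-set form. The names `best_of_k_value`, `greedy_portfolio`, and the toy `scores` matrix are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def best_of_k_value(scores: Sequence[Sequence[float]], portfolio: Sequence[int]) -> float:
    """Empirical expected best-of-k: for each query, take the best score achieved
    by any retriever in the portfolio, then average over queries."""
    if not portfolio:
        return 0.0
    return sum(max(row[j] for j in portfolio) for row in scores) / len(scores)


def greedy_portfolio(scores: Sequence[Sequence[float]], k: int) -> List[int]:
    """Greedy selection (an assumed stand-in, not the paper's algorithm):
    repeatedly add the retriever with the largest marginal gain in the
    best-of-k objective. Assumes all scores are non-negative."""
    n_retrievers = len(scores[0])
    chosen: List[int] = []
    # Best score obtained so far on each query by the current portfolio.
    best_so_far = [0.0] * len(scores)
    for _ in range(min(k, n_retrievers)):
        best_gain, best_j = -1.0, -1
        for j in range(n_retrievers):
            if j in chosen:
                continue
            # Marginal gain: total improvement over the current per-query best.
            gain = sum(max(row[j] - b, 0.0) for row, b in zip(scores, best_so_far))
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        best_so_far = [max(b, row[best_j]) for row, b in zip(scores, best_so_far)]
    return chosen


# Toy data (hypothetical): 4 queries x 3 candidate retrievers.
# Retriever 0 is strong on the first two queries, retriever 1 on the last two,
# and retriever 2 is a reasonable generalist.
scores = [
    [0.9, 0.2, 0.8],
    [0.8, 0.1, 0.7],
    [0.1, 0.9, 0.6],
    [0.2, 0.8, 0.7],
]

portfolio = greedy_portfolio(scores, k=2)
print(portfolio, best_of_k_value(scores, portfolio))
```

On this toy matrix, greedy first picks the generalist (retriever 2) and then retriever 1, reaching an objective of about 0.8, whereas the pair of specialists (retrievers 0 and 1) reaches about 0.85. The gap illustrates why guarantees for such procedures are stated as near-optimal rather than optimal: with non-negative scores this objective is monotone submodular, for which greedy selection is known to achieve at least a $(1 - 1/e)$ fraction of the optimum.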