Recent retrieval-augmented generation (RAG) approaches have demonstrated strong capability in handling complex queries, yet current research overlooks a critical challenge: different retrievers require fundamentally different query formulation strategies for optimal performance. In this work, we present the first systematic analysis of how LLMs can learn to adapt their query formulation strategies to different retrievers via reinforcement learning (RL). Our empirical study reveals that RL effectively teaches an LLM to tailor its queries to specific retriever characteristics. We find that different retrievers exhibit surprisingly distinct optimal query styles (e.g., descriptive vs. question-like), suggesting that strategies learned for one retriever can be ineffective for another. We further show that performance can be enhanced by incorporating retriever-specific human guidance and by scaling model size. To facilitate learning over trajectories with multiple retrieval steps, we introduce a branching-based rollout technique that improves training stability. Our work provides the first empirical evidence and actionable insights for building truly retriever-aware RAG systems. Code and resources are available at https://github.com/LCO-Embedding/Envs-aware-Information-Retrieval.