User queries in information retrieval are often ambiguous, making it challenging for systems to identify a user's target from a single query. While recent dialogue-based interactive retrieval systems can clarify user intent, they are inefficient because they often lack an explicit strategy for asking the most informative questions. To address this limitation, we propose SherlockLLM, a dialogue-driven retrieval framework that learns an optimal questioning strategy via Reinforcement Learning (RL) without requiring large-scale annotated dialogue data. In our framework, an agent is trained to generate a sequence of binary questions that efficiently narrow down the search space. To validate our approach, we introduce a benchmark with both structured and unstructured tasks. Experimental results show that SherlockLLM is a robust and efficient solution. On the structured tasks, its performance matches strong baselines and approaches the theoretical optimum defined by binary search. On the challenging unstructured task, our agent significantly outperforms these baselines, showcasing its ability to learn a highly effective information-seeking dialogue policy.
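As a point of reference for the binary-search optimum mentioned above, the following is a minimal sketch of the implied lower bound; the symbols $N$ and $Q_{\min}$ are our own notation and do not appear in the abstract. With a candidate pool of size $N$ and binary (yes/no) answers, each question conveys at most one bit of information, so any questioning policy needs at least
\[
Q_{\min}(N) = \lceil \log_2 N \rceil
\]
questions in the worst case; for example, $N = 1024$ candidates give $Q_{\min} = 10$ questions. This is the bound that the structured-task results are compared against.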