In this paper, we study how open-source large language models (LLMs) can be effectively deployed to improve query rewriting in conversational search, especially for ambiguous queries. We introduce CHIQ, a two-step method that leverages the capabilities of LLMs to resolve ambiguities in the conversation history before query rewriting. This approach contrasts with prior studies, which predominantly use closed-source LLMs to generate search queries directly from the conversation history. We demonstrate on five well-established benchmarks that CHIQ achieves state-of-the-art results in most settings and performs competitively with systems that rely on closed-source LLMs. Our study provides a first step towards leveraging open-source LLMs in conversational search as a competitive alternative to the prevailing reliance on commercial LLMs. Data, models, and source code will be made publicly available upon acceptance at https://github.com/fengranMark/CHIQ.
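The two-step idea described above (first resolve ambiguities in the conversation history, then rewrite the query) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the prompt wording, and the `LLM` callable type are all hypothetical and were chosen only for illustration. Any text-generation backend that maps a prompt string to an output string could be plugged in.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to generated text.
LLM = Callable[[str], str]


def enhance_history(history: List[str], llm: LLM) -> str:
    """Step 1 (illustrative): ask the LLM to resolve ambiguities in the history."""
    prompt = (
        "Rewrite the conversation below so that every turn is self-contained "
        "(resolve pronouns and omitted references):\n" + "\n".join(history)
    )
    return llm(prompt)


def rewrite_query(enhanced_history: str, query: str, llm: LLM) -> str:
    """Step 2 (illustrative): rewrite the current query using the clarified history."""
    prompt = (
        "Clarified conversation:\n" + enhanced_history + "\n"
        "Rewrite the final user query as a standalone search query:\n" + query
    )
    return llm(prompt)


def two_step_rewrite(history: List[str], query: str, llm: LLM) -> str:
    """Run both steps in order: history enhancement, then query rewriting."""
    enhanced = enhance_history(history, llm)
    return rewrite_query(enhanced, query, llm)
```

In this sketch the second call only sees the output of the first, which mirrors the contrast drawn in the abstract with approaches that generate the search query directly from the raw conversation history.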