Query expansion is a widely used technique for improving the recall of search systems. In this paper, we propose an approach to query expansion that leverages the generative abilities of Large Language Models (LLMs). Unlike traditional query expansion approaches such as Pseudo-Relevance Feedback (PRF), which rely on retrieving a good set of pseudo-relevant documents to expand queries, we rely on the generative and creative abilities of an LLM and leverage the knowledge inherent in the model. We study a variety of prompts, including zero-shot, few-shot, and Chain-of-Thought (CoT). We find that CoT prompts are especially useful for query expansion because they instruct the model to break queries down step by step and can yield a large number of terms related to the original query. Experimental results on MS-MARCO and BEIR demonstrate that query expansions generated by LLMs can be more powerful than traditional query expansion methods.
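As a rough illustration of prompt-based expansion, the sketch below builds a zero-shot or CoT prompt for a query and appends the model's output to the original query. It is a minimal sketch under stated assumptions, not the exact procedure evaluated here: the template wording, the `generate` callable (a stand-in for any LLM client), the helper names, and the choice to repeat the original query `query_repeats` times so that it keeps weight against the longer generated text are all hypothetical.

```python
from typing import Callable

# Illustrative templates: the wording is an assumption, not quoted from the paper.
ZERO_SHOT_TEMPLATE = "Answer the following query: {query}"
COT_TEMPLATE = (
    "Answer the following query: {query}\n"
    "Give the rationale before answering."
)


def build_prompt(query: str, style: str = "cot") -> str:
    """Return a zero-shot or Chain-of-Thought prompt for the given query."""
    templates = {"zero_shot": ZERO_SHOT_TEMPLATE, "cot": COT_TEMPLATE}
    if style not in templates:
        raise ValueError(f"unknown prompt style: {style!r}")
    return templates[style].format(query=query)


def expand_query(
    query: str,
    generate: Callable[[str], str],  # hypothetical LLM wrapper: prompt -> text
    style: str = "cot",
    query_repeats: int = 5,  # assumed up-weighting of the original query terms
) -> str:
    """Concatenate repeated copies of the query with the LLM-generated text."""
    generated = generate(build_prompt(query, style)).strip()
    return " ".join([query] * query_repeats + [generated])
```

The expanded string can then be passed to a sparse retriever such as BM25 in place of the original query. Any function that maps a prompt string to generated text can serve as `generate`.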