Precisely understanding users' contextual search intent has long been an important challenge for conversational search. Because conversational search sessions are highly diverse and long-tailed, existing methods trained on limited data are still not effective or robust enough to handle real conversational search scenarios. Recently, large language models (LLMs) have demonstrated strong capabilities in text generation and conversation understanding. In this work, we present a simple yet effective prompting framework, called LLM4CS, that leverages LLMs as text-based search intent interpreters to help conversational search. Under this framework, we explore three prompting methods to generate multiple query rewrites and hypothetical responses, and propose aggregating them into an integrated representation that robustly represents the user's real contextual search intent. Extensive automatic and human evaluations on three widely used conversational search benchmarks, CAsT-19, CAsT-20, and CAsT-21, show that our simple LLM4CS framework performs remarkably well compared with existing methods, and even compared with using human rewrites. Our findings provide important evidence for better understanding and leveraging LLMs for conversational search.
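The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the rewrites and hypothetical responses are combined, so everything here is an assumption: the texts are taken to be already encoded as dense vectors, each group is mean-pooled, the two group means are averaged with equal weight, and the result is L2-normalized. The function names `mean_pool`, `l2_normalize`, and `aggregate_intent` are hypothetical and are not taken from LLM4CS.

```python
import math
from typing import List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must share one dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def aggregate_intent(
    rewrite_vecs: Sequence[Sequence[float]],
    response_vecs: Sequence[Sequence[float]],
) -> List[float]:
    """Combine rewrite and response embeddings into one intent vector.

    Assumption (not from the source): each group is mean-pooled, then the
    group means are averaged with equal weight and L2-normalized.
    """
    parts = [mean_pool(rewrite_vecs)]
    if response_vecs:  # hypothetical responses are treated as optional here
        parts.append(mean_pool(response_vecs))
    return l2_normalize(mean_pool(parts))


# Toy 2-dimensional embeddings standing in for a real encoder's output.
intent = aggregate_intent([[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]])
```

The resulting vector could then be used as the query in a dense retriever. Equal weighting is only one possible choice; pooling all vectors together or weighting them differently would be equally valid under this sketch.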