Conversational passage retrieval is challenging because it often requires resolving references to previous utterances and handling linguistic phenomena such as coreference and ellipsis. To address these challenges, pre-trained sequence-to-sequence neural query rewriters are commonly used to generate a single de-contextualized query from the conversation history. Previous research shows that combining multiple query rewrites for the same user utterance improves retrieval performance. We propose using a neural query rewriter to generate multiple queries and show how to integrate those queries into the passage retrieval pipeline efficiently. The main strength of our approach lies in its simplicity: it exploits the fact that beam search already maintains several candidate hypotheses during decoding, so multiple query rewrites are produced at no additional cost. Our contributions further include ways to utilize multi-query rewrites in both sparse and dense first-pass retrieval. We demonstrate that applying our approach on top of a standard passage retrieval pipeline delivers state-of-the-art performance without sacrificing efficiency.
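The abstract does not spell out how the beam hypotheses are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' confirmed method. It assumes the rewriter returns each hypothesis with a log-probability score, turns those scores into weights, and then builds a weighted bag of terms for sparse retrieval and a weighted average embedding for dense retrieval. The example rewrites, their scores, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def normalize_scores(log_probs):
    """Softmax over beam log-probabilities, giving weights that sum to 1."""
    peak = max(log_probs)  # subtract the max for numerical stability
    exps = [math.exp(lp - peak) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_term_query(rewrites, weights):
    """Sparse side (assumed scheme): merge rewrites into one weighted bag of terms.

    A term's weight is the sum of the weights of the rewrites that contain it,
    so terms shared by many likely rewrites count the most.
    """
    term_weights = defaultdict(float)
    for text, weight in zip(rewrites, weights):
        for term in set(text.lower().split()):
            term_weights[term] += weight
    return dict(term_weights)


def weighted_query_vector(vectors, weights):
    """Dense side (assumed scheme): weighted average of the rewrite embeddings."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for vec, w in zip(vectors, weights)) for i in range(dim)]


# Hypothetical beam output: (rewrite, sequence log-probability) pairs.
beams = [
    ("what is the population of paris", -0.2),
    ("what is the population of paris france", -0.9),
    ("what is its population", -2.3),
]
rewrites = [text for text, _ in beams]
weights = normalize_scores([score for _, score in beams])

# One weighted term query that a sparse retriever could consume.
sparse_query = weighted_term_query(rewrites, weights)

# Toy 2-dimensional embeddings standing in for a real query encoder.
dense_query = weighted_query_vector([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights)
```

Under these assumptions, both retrievers still receive a single query object per turn, which is one way the first-pass retrieval cost could stay close to that of a single rewrite.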