Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from, and the feedback model, which is how the given feedback text is used to refine the query representation. However, the independent role that each dimension plays is unclear, as both are often entangled in empirical evaluations. In this paper, we address this gap by systematically studying, through controlled experimentation, how the choice of feedback source and feedback model impacts PRF effectiveness. Across 13 low-resource BEIR tasks and five LLM PRF methods, our results show: (1) the choice of feedback model can play a critical role in PRF effectiveness; (2) feedback derived solely from LLM-generated text provides the most cost-effective solution; and (3) feedback derived from the corpus is most beneficial when utilizing candidate documents from a strong first-stage retriever. Together, our findings provide a better understanding of which elements in the PRF design space are most important.
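To make the "feedback model" dimension concrete, the sketch below shows one common instantiation: Rocchio-style interpolation in embedding space, where the query embedding is mixed with the centroid of the feedback embeddings. This is a minimal illustration under assumed conventions, not necessarily one of the five methods evaluated here; the function name, the embedding dimensionality, and the `alpha` weight are assumptions for exposition.

```python
import numpy as np

def rocchio_prf(query_emb: np.ndarray,
                feedback_embs: list[np.ndarray],
                alpha: float = 0.6) -> np.ndarray:
    """Rocchio-style vector PRF (illustrative, not the paper's method):
    interpolate the original query embedding with the centroid of the
    feedback embeddings. `alpha` (an assumed hyperparameter) controls
    how much of the original query is retained."""
    centroid = np.mean(feedback_embs, axis=0)
    refined = alpha * query_emb + (1.0 - alpha) * centroid
    # Normalize so dot-product scoring behaves like cosine similarity.
    return refined / np.linalg.norm(refined)

# Example with random stand-ins for encoded feedback passages;
# in practice these would come from a dense encoder.
rng = np.random.default_rng(0)
q = rng.standard_normal(768)
feedback = [rng.standard_normal(768) for _ in range(3)]
q_prf = rocchio_prf(q, feedback)
```

Note that the same update rule is agnostic to the "feedback source" dimension: `feedback_embs` could encode LLM-generated expansion text or top-k candidate documents retrieved from the corpus by a first-stage retriever.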