Current query expansion models use pseudo-relevance feedback (PRF) to improve first-pass retrieval effectiveness; however, this approach fails when the initial results are not relevant. Instead of building a language model from retrieved results, we propose Generative Relevance Feedback (GRF), which builds probabilistic feedback models from long-form text generated by Large Language Models. We study effective methods for generating this text by varying the zero-shot generation subtask: queries, entities, facts, news articles, documents, and essays. We evaluate GRF on document retrieval benchmarks covering a diverse set of queries and document collections, and the results show that GRF methods significantly outperform previous PRF methods. Specifically, GRF improves MAP by 5-19% and NDCG@10 by 17-24% compared to RM3 expansion, and achieves the best R@1k effectiveness on all datasets compared to state-of-the-art sparse, dense, and expansion models.
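The idea of building a probabilistic feedback model from generated text, rather than from retrieved documents, can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the stopword list, the `num_terms` cutoff, the `beta` interpolation weight, and the RM3-style mixing with the original query are all assumptions made for illustration, and the generated passages are hard-coded placeholders standing in for LLM output.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "is", "into", "of", "the", "to"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def feedback_model(generated_texts, num_terms=10):
    """Unigram distribution over generated text, truncated to the most
    frequent terms and renormalised so the kept weights sum to 1."""
    counts = Counter()
    for text in generated_texts:
        counts.update(tokenize(text))
    top = counts.most_common(num_terms)
    total = sum(count for _, count in top)
    if total == 0:
        return {}
    return {term: count / total for term, count in top}


def query_model(query):
    """Maximum-likelihood unigram distribution over the query terms."""
    counts = Counter(tokenize(query))
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def interpolate(q_model, fb_model, beta=0.5):
    """Mix the query model and feedback model (assumed RM3-style):
    beta weights the original query, 1 - beta the feedback terms."""
    terms = set(q_model) | set(fb_model)
    return {
        t: beta * q_model.get(t, 0.0) + (1 - beta) * fb_model.get(t, 0.0)
        for t in terms
    }


# Placeholder passages standing in for text an LLM might generate.
generated = [
    "Solar panels convert sunlight into energy.",
    "Sunlight is a renewable energy source.",
]
expanded_query = interpolate(
    query_model("solar energy"), feedback_model(generated, num_terms=5)
)
```

The resulting `expanded_query` maps each term to a weight and could be passed to any sparse retriever that accepts weighted query terms. Because the feedback text does not come from a first-pass ranking, the expansion does not depend on the initial results being relevant.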