We propose EAR, a query Expansion And Reranking approach for improving passage retrieval, with application to open-domain question answering. EAR first applies a query expansion model to generate a diverse set of queries, and then uses a query reranker to select the ones that are likely to lead to better retrieval results. Motivated by the observation that the best query expansion is often not the one picked by greedy decoding, EAR trains its reranker to predict the rank order of the gold passages when the expanded queries are issued to a given retriever. By better connecting the query expansion model and the retriever, EAR significantly enhances a traditional sparse retrieval method, BM25. Empirically, EAR improves top-5/20 accuracy by 3-8 points in in-domain settings and 5-10 points in out-of-domain settings, when compared to a vanilla query expansion model, GAR, and a dense retrieval model, DPR.
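The expand-then-rerank idea can be illustrated with a minimal sketch. It is not the authors' implementation: a toy token-overlap scorer stands in for BM25, the candidate expansions are supplied by hand rather than sampled from an expansion model, and the names `overlap_retrieve`, `gold_rank`, and `select_query` are hypothetical. The `predict_rank` argument plays the role of the reranker; in the usage example an oracle that looks up the true gold rank is used in its place, which is also the kind of signal the abstract describes as the reranker's training target.

```python
from typing import Callable, List, Sequence


def overlap_retrieve(query: str, passages: Sequence[str]) -> List[int]:
    """Toy sparse retriever (an assumed stand-in for BM25).

    Returns passage indices ordered by the number of distinct tokens
    shared with the query, best first.
    """
    q_tokens = set(query.lower().split())
    scores = [len(q_tokens & set(p.lower().split())) for p in passages]
    # sorted() is stable, so ties keep the original passage order.
    return sorted(range(len(passages)), key=lambda i: -scores[i])


def gold_rank(query: str, passages: Sequence[str], gold_index: int) -> int:
    """1-based rank of the gold passage when `query` is issued to the retriever."""
    return overlap_retrieve(query, passages).index(gold_index) + 1


def select_query(
    question: str,
    expansions: Sequence[str],
    predict_rank: Callable[[str], float],
) -> str:
    """Append each expansion to the question and keep the candidate whose
    predicted gold-passage rank is lowest (lower rank = better retrieval)."""
    candidates = [f"{question} {e}" for e in expansions]
    return min(candidates, key=predict_rank)


# Usage with hand-written data and an oracle in place of a trained reranker.
passages = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "the eiffel tower is in paris",  # gold passage (index 2)
]
question = "where is the eiffel tower"
expansions = ["capital germany berlin", "eiffel tower paris"]

best = select_query(
    question,
    expansions,
    predict_rank=lambda q: gold_rank(q, passages, gold_index=2),
)
```

Here the first expansion pushes the gold passage down to rank 2, while the second brings it to rank 1, so the second candidate is selected. A trained reranker would have to estimate this ordering from the question and expansion text alone, without access to the gold passage.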