Query expansion has been shown to improve the recall and precision of first-stage retrievers, yet its influence on complex, state-of-the-art cross-encoder rankers remains under-explored. We first show that directly applying expansion techniques from the current literature to state-of-the-art neural rankers can degrade zero-shot performance. To address this, we propose GFF, a pipeline comprising a large language model and a neural ranker that Generates, Filters, and Fuses query expansions more effectively in order to improve zero-shot ranking metrics such as nDCG@10. Specifically, GFF first calls an instruction-following language model to generate query-related keywords through a reasoning chain. Leveraging self-consistency and reciprocal rank weighting, GFF then dynamically filters and combines the ranking results of each expanded query. With this pipeline, we show that GFF can improve zero-shot nDCG@10 on BEIR and TREC DL 2019/2020. We also analyze different modelling choices in the GFF pipeline and shed light on future directions in query expansion for zero-shot neural rankers.
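The filter and fuse steps can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: self-consistency is approximated by keeping only keywords that recur across several sampled generations, and reciprocal rank weighting is approximated by standard reciprocal rank fusion with the conventional constant k = 60. The function names `filter_keywords` and `fuse_rankings`, the `min_votes` threshold, and the toy data are hypothetical and are not taken from the GFF implementation.

```python
from collections import Counter, defaultdict


def filter_keywords(samples, min_votes=2):
    """Keep keywords that occur in at least `min_votes` sampled generations.

    Assumption: this majority-style vote stands in for the paper's
    self-consistency filter, whose exact rule is not stated in the abstract.
    """
    votes = Counter()
    for sample in samples:
        # Normalize and count each keyword at most once per sample.
        votes.update({kw.strip().lower() for kw in sample})
    return sorted(kw for kw, n in votes.items() if n >= min_votes)


def fuse_rankings(rankings, k=60):
    """Standard reciprocal rank fusion: score(d) = sum of 1 / (k + rank).

    Assumption: plain RRF is used here as a stand-in for the paper's
    reciprocal rank weighting scheme.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy data: three sampled keyword lists from a hypothetical language model.
samples = [
    ["solar panel", "efficiency", "photovoltaic"],
    ["photovoltaic", "solar panel", "inverter"],
    ["solar panel", "photovoltaic", "cost"],
]
kept = filter_keywords(samples)

# Toy rankings a neural ranker might return for each query variant.
rankings = [
    ["d1", "d2", "d3"],  # original query
    ["d2", "d1", "d4"],  # query expanded with "photovoltaic"
    ["d2", "d3", "d1"],  # query expanded with "solar panel"
]
fused = fuse_rankings(rankings)

print(kept)
print(fused)
```

In this toy run, only the keywords proposed by at least two samples survive the filter, and the document ranked highly by most query variants ends up first in the fused list. A real pipeline would replace the toy lists with language model outputs and cross-encoder rankings.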