Despite considerable progress in neural relevance ranking techniques, search engines still struggle to process complex queries effectively, in terms of both precision and recall. Sparse and dense Pseudo-Relevance Feedback (PRF) approaches have the potential to overcome limitations in recall, but they are effective only when precision in the top ranks is high. In this work, we tackle the problem of search over complex queries using three complementary techniques. First, we demonstrate that applying a strong neural re-ranker before sparse or dense PRF can improve retrieval effectiveness by 5-8%. This improvement in PRF effectiveness can be attributed directly to the improved precision of the feedback set. Second, we propose an enhanced expansion model, Latent Entity Expansion (LEE), which applies fine-grained word and entity-based relevance modelling incorporating localized features. Specifically, we find that including both words and entities for expansion achieves a further 2-8% improvement in NDCG. Our analysis also demonstrates that LEE is largely robust to its parameters across datasets and performs well on entity-centric queries. Third, we include an 'adaptive' component in the retrieval process, which iteratively refines the re-ranking pool during scoring using the expansion model and avoids re-ranking additional documents. We find that this combination of techniques achieves the best NDCG, MAP and R@1000 results on the TREC Robust 2004 and CODEC document datasets, demonstrating a significant advancement in expansion effectiveness.
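The first two ideas in the abstract (re-rank first, then build the expansion from the re-ranked feedback set) can be illustrated with a minimal sketch. This is not the LEE model itself: it uses a simplified RM3-style relevance model over a single bag of tokens in which entities appear as plain IDs, and it omits the adaptive re-ranking step. All names here (`overlap_scorer`, `rerank`, `expand_query`, the `ENTITY:` prefix, and the parameter defaults) are assumptions made for illustration, and `overlap_scorer` is only a toy stand-in for a neural re-ranker.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical document type: (doc_id, tokens). Tokens may mix words and entity IDs.
Doc = Tuple[str, List[str]]


def overlap_scorer(query: List[str], tokens: List[str]) -> float:
    """Toy stand-in for a neural re-ranker: fraction of tokens that match the query."""
    if not tokens:
        return 0.0
    query_terms = set(query)
    return sum(1 for t in tokens if t in query_terms) / len(tokens)


def rerank(query: List[str], docs: Sequence[Doc],
           scorer: Callable[[List[str], List[str]], float]) -> List[Tuple[Doc, float]]:
    """Score every candidate with `scorer` and sort best-first."""
    scored = [(doc, scorer(query, doc[1])) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def expand_query(query: List[str], ranked: List[Tuple[Doc, float]],
                 fb_docs: int = 2, fb_terms: int = 3,
                 orig_weight: float = 0.5) -> Dict[str, float]:
    """RM3-style expansion (assumes non-negative scores).

    Each term is weighted by P(term | doc) times the normalised document score,
    summed over the top `fb_docs` documents, then interpolated with the query.
    """
    feedback = ranked[:fb_docs]
    total = sum(score for _, score in feedback) or 1.0
    term_weights: Counter = Counter()
    for (_, tokens), score in feedback:
        if not tokens:
            continue  # skip empty documents to avoid division by zero
        for term, tf in Counter(tokens).items():
            term_weights[term] += (tf / len(tokens)) * (score / total)

    top = term_weights.most_common(fb_terms)
    norm = sum(w for _, w in top) or 1.0

    expanded: Dict[str, float] = {}
    for term in query:  # original query mass
        expanded[term] = expanded.get(term, 0.0) + orig_weight / len(query)
    for term, w in top:  # feedback mass
        expanded[term] = expanded.get(term, 0.0) + (1 - orig_weight) * w / norm
    return expanded


# Toy first-pass ranking in which the most relevant document is ranked last.
first_pass: List[Doc] = [
    ("d3", ["stock", "market", "news"]),
    ("d2", ["solar", "eclipse", "moon"]),
    ("d1", ["solar", "panel", "ENTITY:Photovoltaics", "cost"]),
]
query = ["solar", "panel", "cost"]

ranked = rerank(query, first_pass, overlap_scorer)
expanded = expand_query(query, ranked, fb_docs=1, fb_terms=4)
```

In this toy run the re-ranker moves `d1` to the top, so the feedback set contains only the relevant document and the expansion picks up its entity ID. Had the feedback set been drawn from the first-pass order, the expansion terms would have come from `d3`, which is the failure mode that re-ranking before PRF is meant to avoid.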