Queries with similar information needs tend to have similar document clicks, especially in biomedical literature search engines, where queries are generally short and the top documents account for most of the total clicks. Motivated by this, we present a novel architecture for biomedical literature search, namely Log-Augmented DEnse Retrieval (LADER), a simple plug-in module that augments a dense retriever with the click logs retrieved from similar training queries. Specifically, LADER uses a dense retriever to find both documents and training queries that are similar to the given query. LADER then scores the relevant (clicked) documents of those similar queries, weighting each by the similarity of its query to the input query. The final LADER document score is the average of (1) the document similarity score from the dense retriever and (2) the aggregated document score from the click logs of similar queries. Despite its simplicity, LADER achieves new state-of-the-art (SOTA) performance on TripClick, a recently released benchmark for biomedical literature retrieval. On the frequent (HEAD) queries, LADER outperforms the best previous retrieval model by 39% relative NDCG@10 (0.338 vs. 0.243). LADER also performs better on the less frequent (TORSO) queries, with an 11% relative NDCG@10 improvement over the previous SOTA (0.303 vs. 0.272). On the rare (TAIL) queries, where similar queries are scarce, LADER still compares favorably to the previous SOTA method (NDCG@10: 0.310 vs. 0.295). On all queries, LADER improves the performance of a dense retriever by 24%-37% relative NDCG@10 without requiring additional training, and further improvement is expected from more logs. Our regression analysis shows that queries that are more frequent, have higher entropy of query similarity, and have lower entropy of document similarity tend to benefit more from log augmentation.
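The scoring procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function name `lader_scores`, the `top_k_queries` parameter, the use of a plain dot product as the dense similarity, and the choice to sum similarity-weighted clicks without any normalization. The abstract only states that clicked documents are weighted by query similarity and that the two scores are averaged.

```python
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, standing in for the dense retriever's similarity (assumption)."""
    return sum(x * y for x, y in zip(a, b))


def lader_scores(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    train_query_vecs: Sequence[Sequence[float]],
    click_logs: Dict[int, List[int]],
    top_k_queries: int = 2,
) -> List[float]:
    """Hypothetical sketch of log-augmented scoring.

    click_logs maps a training-query index to the indices of its clicked documents.
    """
    # (1) Document similarity scores from the dense retriever.
    doc_sim = [dot(d, query_vec) for d in doc_vecs]

    # Similarity of every training query to the input query.
    query_sim = [dot(t, query_vec) for t in train_query_vecs]

    # Keep only the most similar training queries (the cutoff is an assumption).
    nearest = sorted(
        range(len(query_sim)), key=lambda i: query_sim[i], reverse=True
    )[:top_k_queries]

    # (2) Aggregate clicked documents, weighted by query similarity.
    log_scores = [0.0] * len(doc_vecs)
    for qi in nearest:
        for di in click_logs.get(qi, []):
            log_scores[di] += query_sim[qi]

    # Final score: average of the two components.
    return [(s + l) / 2 for s, l in zip(doc_sim, log_scores)]
```

In a toy setting with three documents, a document that the retriever alone ranks second can move to first place once a near-identical training query with a click on that document is taken into account. Because the click logs are only looked up at inference time, a sketch like this needs no additional training of the retriever, which matches the plug-in claim in the abstract.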