A common retrieve-and-rerank paradigm first retrieves a broad set of relevant candidates with a scalable bi-encoder, then applies expensive but more accurate cross-encoders to a limited candidate set. However, restricting reranking to this small subset often propagates errors from the bi-encoder, which limits the performance of the overall pipeline. To address these issues, we propose the Comparing Multiple Candidates (CMC) framework, which compares a query and multiple candidate embeddings jointly through shallow self-attention layers. While providing contextualized representations, CMC is scalable enough to handle many comparisons simultaneously: comparing 2K candidates takes only twice as long as comparing 100. Practitioners can use CMC as a lightweight and effective reranker to improve top-1 accuracy. Moreover, when integrated with another retriever, CMC reranking can function as a virtually enhanced retriever. This configuration adds only negligible latency compared to using a single retriever (virtual), while significantly improving Recall@k (enhanced). Through experiments, we demonstrate that CMC, as a virtually enhanced retriever, significantly improves Recall@k (+6.7%-p and +3.5%-p for R@16 and R@64, respectively) compared to the initial retrieval stage on the ZeSHEL dataset. We also conduct experiments on direct reranking for entity, passage, and dialogue ranking. The results indicate that CMC is not only faster (11x) than cross-encoders but also often more effective, with improved prediction performance in Wikipedia entity linking (+0.7%-p) and DSTC7 dialogue ranking (+3.3%-p). The code and links to datasets are available at https://github.com/yc-song/cmc.
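The joint comparison described above can be illustrated with a minimal sketch. It is not the released implementation: it assumes precomputed query and candidate embeddings, uses randomly initialized single-head attention weights in place of trained ones, and scores each candidate by a dot product between the contextualized query and candidate vectors. The names `self_attention`, `rerank`, and `random_matrix` are hypothetical, and only the Python standard library is used so the example runs anywhere.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Stand-in for trained weights (assumption: random values, illustration only)."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens, w_q, w_k, w_v):
    """One single-head self-attention layer with a residual connection."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    dim = len(tokens[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of the key matrix
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, v)
    return [[t + a for t, a in zip(t_row, a_row)]
            for t_row, a_row in zip(tokens, attended)]


def rerank(query_emb, candidate_embs, layers):
    """Contextualize the query and all candidates jointly, then score candidates."""
    # The query and every candidate form one sequence, so each embedding
    # attends to all the others in a single pass.
    tokens = [query_emb] + candidate_embs
    for w_q, w_k, w_v in layers:
        tokens = self_attention(tokens, w_q, w_k, w_v)
    query_ctx, cands_ctx = tokens[0], tokens[1:]
    # Assumption: dot-product scoring between contextualized vectors.
    scores = [sum(a * b for a, b in zip(query_ctx, c)) for c in cands_ctx]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked, scores


rng = random.Random(0)
dim, num_candidates = 8, 5
query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
candidates = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_candidates)]
# Two shallow layers, each with its own query/key/value projections.
layers = [tuple(random_matrix(dim, dim, rng) for _ in range(3)) for _ in range(2)]
ranked, scores = rerank(query, candidates, layers)
print(ranked)  # candidate indices, best first
```

Because the candidates enter as single embeddings rather than full token sequences, the attention input has one position per candidate plus one for the query, which is what keeps the joint comparison cheap relative to running a cross-encoder on every query-candidate pair.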