The late interaction paradigm introduced with ColBERT stands out in the neural Information Retrieval space, offering a compelling effectiveness-efficiency trade-off across many benchmarks. Efficient late interaction retrieval relies on an optimized multi-step strategy, in which an approximate search first identifies a set of candidate documents that are then re-ranked exactly. In this work, we introduce SPLATE, a simple and lightweight adaptation of the ColBERTv2 model which learns an "MLM adapter", mapping its frozen token embeddings to a sparse vocabulary space with a partially learned SPLADE module. This allows us to perform the candidate generation step in late interaction pipelines with traditional sparse retrieval techniques, making it particularly appealing for running ColBERT in CPU environments. Our SPLATE ColBERTv2 pipeline achieves the same effectiveness as the PLAID ColBERTv2 engine by re-ranking 50 documents that can be retrieved in under 10 ms.
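The two-stage pipeline described in the abstract (sparse candidate generation followed by exact late interaction re-ranking) can be illustrated with a minimal sketch. This is not the SPLATE implementation: the function names `sparse_candidates`, `maxsim`, and `rerank` are hypothetical, the toy vectors are made up for illustration, and a dense matrix product stands in for the inverted index that a real sparse retriever would use. In SPLATE the sparse vectors would come from the learned MLM adapter; here they are simply given.

```python
import numpy as np

def sparse_candidates(query_vec, doc_matrix, k):
    """Stage 1: score documents by the dot product of sparse vocabulary
    vectors and return the ids of the k highest-scoring documents."""
    scores = doc_matrix @ query_vec
    top = np.argsort(-scores, kind="stable")[:k]
    return top.tolist()

def maxsim(query_emb, doc_emb):
    """ColBERT-style late interaction score: for each query token, take the
    maximum similarity over document tokens, then sum over query tokens."""
    sim = query_emb @ doc_emb.T          # shape: (query tokens, doc tokens)
    return float(sim.max(axis=1).sum())

def rerank(query_emb, doc_embs, candidate_ids):
    """Stage 2: exact re-ranking of the candidates with MaxSim."""
    scored = [(doc_id, maxsim(query_emb, doc_embs[doc_id]))
              for doc_id in candidate_ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data (assumed, for illustration only): 3 documents, vocabulary of 4 terms.
doc_matrix = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.5, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 0.0]])
query_vec = np.array([1.0, 1.0, 0.0, 0.0])

# Token-level embeddings (dimension 2) used only in the re-ranking stage.
query_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_embs = {
    0: np.array([[1.0, 0.0], [0.0, 1.0]]),
    1: np.array([[1.0, 0.0], [1.0, 0.0]]),
    2: np.array([[0.0, 1.0]]),
}

candidates = sparse_candidates(query_vec, doc_matrix, k=2)
ranked = rerank(query_emb, doc_embs, candidates)
print(candidates)  # [1, 0]
print(ranked)      # [(0, 2.0), (1, 1.0)]
```

In this toy case the sparse stage ranks document 1 first, while the exact MaxSim stage promotes document 0, which is the kind of correction the re-ranking step is meant to provide. The abstract's setting corresponds to `k=50` candidates.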