In information retrieval (IR), dense retrieval (DR) models use deep learning techniques to encode queries and passages into an embedding space and compute their semantic relevance. It is important for DR models to balance efficiency and effectiveness. Pre-trained language models (PLMs), especially Transformer-based PLMs, have proven to be effective encoders for DR models. However, the self-attention component in Transformer-based PLMs incurs computational complexity that grows quadratically with sequence length, leading to slow inference for long-text retrieval. Some recently proposed non-Transformer PLMs, especially PLMs built on the Mamba architecture, have demonstrated not only effectiveness comparable to Transformer-based PLMs on generative language tasks but also better efficiency, thanks to linear-time scaling in sequence length. This paper implements Mamba Retriever to explore whether Mamba can serve as an effective and efficient encoder for DR models in IR tasks. We fine-tune Mamba Retriever on the classic short-text MS MARCO passage ranking dataset and the long-text LoCoV0 dataset. Experimental results show that (1) on the MS MARCO passage ranking dataset and BEIR, Mamba Retriever achieves comparable or better effectiveness than Transformer-based retrieval models, and its effectiveness grows with the size of the Mamba model; (2) on the long-text LoCoV0 dataset, Mamba Retriever can extend to text lengths longer than its pre-training length after fine-tuning on the retrieval task, and it achieves comparable or better effectiveness than other long-text retrieval models; (3) Mamba Retriever has superior inference speed for long-text retrieval. In conclusion, Mamba Retriever is both effective and efficient, making it a practical model, especially for long-text retrieval.
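The bi-encoder setup described above, where queries and passages are mapped to embeddings and relevance is computed in that space, can be sketched minimally as follows. This is an illustrative toy, not the paper's implementation: the embedding vectors are made-up numbers standing in for encoder outputs, and the inner product is used as the relevance score, a common choice in DR models.

```python
# Minimal sketch of dense-retrieval scoring, assuming a bi-encoder
# (e.g., a PLM such as Mamba) has already mapped the query and each
# passage to fixed-size embeddings. Vectors here are toy values.

def dot(u, v):
    """Inner-product relevance score between two embeddings."""
    return sum(a * b for a, b in zip(u, v))

def rank_passages(query_emb, passage_embs):
    """Return passage indices sorted by descending relevance to the query."""
    scores = [dot(query_emb, p) for p in passage_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

query = [0.2, 0.9, 0.1]
passages = [
    [0.1, 0.8, 0.0],   # most similar to the query -> highest score
    [0.9, 0.1, 0.2],
    [0.0, 0.2, 0.9],
]
print(rank_passages(query, passages))  # -> [0, 1, 2]
```

Because passage embeddings can be precomputed offline, retrieval at query time reduces to one encoder forward pass plus similarity search, which is why encoder inference speed on long passages matters so much for long-text retrieval.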