Cross encoders (CEs) are trained on sentence pairs to detect relatedness. Because CEs require sentence pairs at inference, the prevailing view is that they can only serve as re-rankers in information retrieval pipelines. Dual encoders (DEs) are instead used to embed individual sentences: at training time, sentence pairs are encoded by two separate encoders with shared weights, and a loss function ensures that the pair's embeddings lie close in vector space if the sentences are related. DEs, however, require much larger training datasets and are less accurate than CEs. We report the curious finding that embeddings from earlier layers of CEs can in fact be used within an information retrieval pipeline. We show how to exploit CEs to distill a lighter-weight DE, achieving a 5.15x speedup in inference time.
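To make the CE/DE inference contrast concrete: a DE embeds each sentence independently, so a document corpus can be embedded once offline and each query then needs only a single encoder call plus cheap similarity comparisons, whereas a CE must jointly process every (query, document) pair. The minimal sketch below uses a toy bag-of-characters "encoder" purely as a stand-in for a learned sentence encoder; the function names and the tiny corpus are illustrative assumptions, not from the paper.

```python
import math

def embed(sentence):
    # Toy stand-in for a learned DE sentence encoder:
    # a 26-dimensional bag-of-characters vector.
    vec = [0.0] * 26
    for ch in sentence.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(u, v):
    # Cosine similarity: related sentences should score higher.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# DE-style retrieval: the corpus is embedded once, ahead of time.
corpus = ["cats chase mice", "stocks fell sharply", "mice flee from cats"]
corpus_vecs = [embed(doc) for doc in corpus]

# At query time: one encoder call for the query, then similarity scans.
# A CE would instead need one full forward pass per (query, doc) pair.
query = "cats and mice"
q_vec = embed(query)
scores = [cosine(q_vec, d) for d in corpus_vecs]
best = corpus[max(range(len(scores)), key=scores.__getitem__)]
```

This precompute-then-scan pattern is what makes DEs usable for first-stage retrieval, and it is why a DE distilled from a CE (as proposed here) can deliver the reported inference speedup.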