Embedding-based Retrieval Models (ERMs) have emerged as a promising framework for large-scale text retrieval problems, driven by powerful large language models. Nevertheless, fine-tuning ERMs to reach state-of-the-art results can be expensive due to the extreme scale of the data and the complexity of multi-stage pipelines (e.g., pre-training, fine-tuning, distillation). In this work, we propose the PEFA framework, namely ParamEter-Free Adapters, for fast tuning of ERMs without any backward pass in the optimization. At the index building stage, PEFA equips the ERM with a non-parametric k-nearest neighbor (kNN) component. At the inference stage, PEFA performs a convex combination of two scoring functions, one from the ERM and the other from the kNN. Based on the neighborhood definition, the PEFA framework induces two realizations: PEFA-XL (i.e., extra large), which uses double approximate nearest neighbor (ANN) indices, and PEFA-XS (i.e., extra small), which uses a single ANN index. Empirically, PEFA achieves significant improvements on two retrieval applications. For document retrieval, PEFA improves Recall@100 not only for pre-trained ERMs on Trivia-QA, by an average of 13.2%, but also for fine-tuned ERMs on NQ-320K, by an average of 5.5%. For product search, PEFA improves the Recall@100 of fine-tuned ERMs by an average of 5.3% and 14.5% for PEFA-XS and PEFA-XL, respectively. Our code is available at https://github.com/amzn/pecos/tree/mainline/examples/pefa-wsdm24.
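The inference-time convex combination described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the released code: the function names (`erm_scores`, `knn_scores`, `pefa_scores`), the mixing weight `lam`, the similarity-weighted label voting used as the kNN scoring rule, and the use of exact brute-force search with NumPy in place of an ANN index.

```python
import numpy as np


def erm_scores(query_emb, doc_embs):
    """Inner-product relevance of one query against every document."""
    return doc_embs @ query_emb


def knn_scores(query_emb, train_query_embs, train_labels, k):
    """Score documents by voting from the k nearest training queries.

    train_labels[i, j] is 1 if document j is relevant to training query i.
    Assumption: exact search stands in for an ANN index, and each neighbor
    votes with a weight equal to its similarity to the test query.
    """
    sims = train_query_embs @ query_emb
    top = np.argsort(-sims)[:k]            # indices of the k most similar training queries
    return sims[top] @ train_labels[top]   # shape: (num_docs,)


def pefa_scores(query_emb, doc_embs, train_query_embs, train_labels, lam=0.5, k=1):
    """Convex combination of ERM and kNN scores; lam is a hypothetical weight in [0, 1]."""
    erm = erm_scores(query_emb, doc_embs)
    knn = knn_scores(query_emb, train_query_embs, train_labels, k)
    return lam * erm + (1.0 - lam) * knn
```

In this sketch, setting `lam=1.0` recovers the plain ERM ranking, while smaller values let labeled training queries near the test query shift the ranking toward their relevant documents. No gradient computation is involved, which matches the abstract's description of tuning without a backward pass.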