Recent studies have highlighted the significant potential of Large Language Models (LLMs) as zero-shot relevance rankers. These methods typically use prompt learning to assess query-document relevance by generating a ranked list of candidate documents. Despite their promise, the substantial inference costs of LLMs pose a major obstacle to their direct deployment in commercial search systems. To overcome this barrier and fully exploit the ranking capabilities of LLMs, we explore techniques for transferring the ranking expertise of LLMs to a more compact, BERT-like model via a ranking loss, enabling the deployment of far less resource-intensive models. Specifically, we first strengthen the LLM through Continued Pre-Training, taking the query as input and the clicked title and summary as output. We then perform supervised fine-tuning of the LLM with a rank loss, using the final token as a representation of the entire sentence: given the causal attention of autoregressive language models, only the final token </s> can attend to, and thus encapsulate, all preceding tokens. Finally, we introduce a hybrid point-wise and margin MSE loss to distill the ranking knowledge of the LLM into smaller models such as BERT, yielding a practical solution for environments with strict resource constraints. Both offline and online evaluations confirm the efficacy of our approach, and our model has been successfully serving in a commercial web search engine since February 2024.
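As a minimal sketch of the fine-tuning stage described above: the hidden state of the last non-padding token is taken as the sequence representation, and a pairwise rank loss is applied to the scores of a clicked versus a non-clicked document. The abstract does not specify the exact rank loss, so a RankNet-style pairwise logistic loss is assumed here for illustration; `last_token_pool`, the toy list-based tensors, and the function names are all hypothetical.

```python
import math


def last_token_pool(hidden_states, attention_mask):
    """Select the hidden vector of the final non-padding token per sequence.

    In a causal LM this token is the only one that has attended to the
    whole input, so it serves as the sentence representation.
    """
    pooled = []
    for seq_hidden, seq_mask in zip(hidden_states, attention_mask):
        last_idx = sum(seq_mask) - 1  # index of the last real (</s>) token
        pooled.append(seq_hidden[last_idx])
    return pooled


def pairwise_rank_loss(score_pos, score_neg):
    """Pairwise logistic (RankNet-style) loss: assumed form of the rank loss.

    Penalizes cases where the clicked document is not scored above the
    non-clicked one.
    """
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))


# Toy example: batch of 2 sequences, 1-dim "hidden states", second sequence
# padded at position 2 (mask = 0).
hidden = [[[1.0], [2.0], [3.0]], [[4.0], [5.0], [6.0]]]
mask = [[1, 1, 0], [1, 1, 1]]
pooled = last_token_pool(hidden, mask)  # → [[2.0], [6.0]]
loss = pairwise_rank_loss(2.0, 1.0)
```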
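The distillation step can be sketched as follows. The hybrid objective combines a point-wise MSE term, which pulls the student's (BERT's) scores toward the teacher LLM's scores, with a margin MSE term, which matches the score *difference* between a positive and a negative document. The weighting `alpha` and the exact combination are assumptions for illustration; the paper only states that the two losses are hybridized.

```python
def hybrid_distill_loss(s_pos, s_neg, t_pos, t_neg, alpha=0.5):
    """Hybrid point-wise + margin MSE distillation loss (illustrative sketch).

    s_pos, s_neg: student (e.g. BERT) scores for a positive/negative document.
    t_pos, t_neg: teacher (LLM) scores for the same documents.
    alpha: assumed hyperparameter weighting the two terms.
    """
    # Point-wise MSE: match each student score to the teacher score directly.
    pointwise = (s_pos - t_pos) ** 2 + (s_neg - t_neg) ** 2
    # Margin MSE: match the pairwise score margin, preserving ranking order.
    margin = ((s_pos - s_neg) - (t_pos - t_neg)) ** 2
    return alpha * pointwise + (1.0 - alpha) * margin


# Example: student margin 0.6 vs teacher margin 0.8.
loss = hybrid_distill_loss(0.9, 0.3, 1.0, 0.2, alpha=0.5)  # → 0.03
```

The margin term alone is invariant to a constant shift of the student's scores, which is why the point-wise term is useful: it additionally calibrates the absolute score scale against the teacher.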