Effective query-item relevance modeling is pivotal for enhancing user experience and safeguarding user satisfaction in e-commerce search systems. Recently, benefiting from their vast inherent knowledge, Large Language Model (LLM)-based approaches have demonstrated strong performance and long-tail generalization ability compared with previous specialized neural relevance-learning methods. Though promising, current LLM-based methods suffer from the following shortcomings in practice: First, their massive parameter counts and computational demands make them difficult to deploy online. Second, distilling LLMs into online models is a feasible direction, but LLM relevance modeling is a black box, and its rich intrinsic knowledge is difficult to extract and apply online. To improve the interpretability of LLMs and to boost the performance of online relevance models via LLMs, we propose an Explainable LLM-driven Multi-dimensional Distillation framework for e-commerce relevance learning, which comprises two core components: (1) an Explainable LLM for relevance modeling (ELLM-rele), which decomposes relevance learning into intermediate steps and models it as Chain-of-Thought (CoT) reasoning, thereby enhancing both the interpretability and the performance of the LLM; and (2) a Multi-dimensional Knowledge Distillation (MKD) architecture that transfers the knowledge of ELLM-rele to currently deployable interaction-based and representation-based student models from both the relevance score distribution and the CoT reasoning aspects. By distilling both probabilistic and CoT reasoning knowledge, MKD improves the semantic interaction and long-tail generalization abilities of the student models. Extensive offline evaluations and online experiments on the Taobao search ad scene demonstrate that the proposed framework significantly enhances e-commerce relevance learning performance and user experience.
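The score-distribution side of such a teacher-student setup is commonly realized as a temperature-softened KL divergence between the teacher's and student's relevance distributions. The sketch below illustrates that generic soft-label distillation objective only; the function names, temperature value, and scaling are illustrative assumptions, not the paper's actual MKD loss.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of relevance logits."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened score
    distributions -- the generic soft-label distillation objective.
    The T^2 factor keeps gradient magnitudes comparable across
    temperatures (illustrative choice, not from the paper)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0
    )
```

With identical teacher and student logits the loss is zero, and it grows as the student's distribution drifts from the teacher's, which is the signal a student relevance model would be trained to minimize alongside its hard-label objective.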