Semantic relevance calculation is crucial for e-commerce search engines, as it ensures that the items selected closely align with customer intent. Neglecting it degrades user experience and engagement. Traditional text-matching techniques are prevalent but often fail to capture the nuances of search intent, so neural networks have become the preferred solution for such complex text matching. Existing methods predominantly employ representation-based architectures, which strike a balance between high traffic capacity and low latency. However, they exhibit significant shortcomings in generalization and robustness compared to interaction-based architectures. In this work, we introduce a robust interaction-based modeling paradigm to address these shortcomings. It encompasses 1) a dynamic-length representation scheme for expedited inference, 2) a professional-term recognition method to identify subjects and core attributes from complex sentence structures, and 3) a contrastive adversarial training protocol to bolster the model's robustness and matching capabilities. Extensive offline evaluations demonstrate the superior robustness and effectiveness of our approach, and online A/B testing confirms its ability to improve relevance at the same exposure position, resulting in more clicks and conversions. To the best of our knowledge, this is the first interaction-based approach for large-scale e-commerce search relevance calculation. Notably, we have deployed it for the entire search traffic on alibaba.com, the largest B2B e-commerce platform in the world.