Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Discovering the items a user intends to find in a massive item repository is one of the main goals of an e-commerce search system. Relevance prediction is essential to such a system because it helps improve search performance. A relevance model served online must perform inference that is both fast and accurate. The widely used Bi-encoder and Cross-encoder models are limited in accuracy and inference speed, respectively. In this work, we propose a novel model called the Entity-Based Relevance Model (EBRM). We identify the entities contained in an item and decompose the QI (query-item) relevance problem into multiple QE (query-entity) relevance problems; we then aggregate their results to form the QI prediction using a soft logic formulation. The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy and to cache QE predictions for fast online inference. Using soft logic makes the prediction procedure interpretable and intervenable. We also show that pretraining the QE module with QE data automatically generated from user logs further improves overall performance. The proposed method is evaluated on labeled data from e-commerce websites. Empirical results show that it achieves promising improvements while remaining computationally efficient.
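The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact soft logic formulation, so the code below assumes a product t-norm (a soft AND over the item's entities) as one common choice; it is not presented as the authors' formula. The names `CachedQEScorer`, `toy_qe_scorer`, `soft_and`, and `qi_relevance`, as well as the `overrides` argument used to show intervention, are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# A QE scorer maps (query, entity) to a relevance probability in [0, 1].
# In EBRM this role is played by a Cross-encoder; here it is an abstract callable.
QEScorer = Callable[[str, str], float]


class CachedQEScorer:
    """Wraps a (possibly slow) QE scorer and memoizes its predictions."""

    def __init__(self, scorer: QEScorer) -> None:
        self._scorer = scorer
        self._cache: Dict[Tuple[str, str], float] = {}
        self.misses = 0  # number of times the underlying scorer was called

    def __call__(self, query: str, entity: str) -> float:
        key = (query, entity)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._scorer(query, entity)
        return self._cache[key]


def soft_and(scores: Iterable[float]) -> float:
    """Product t-norm: an assumed soft-logic conjunction, not the paper's formula."""
    result = 1.0
    for s in scores:
        result *= s
    return result


def qi_relevance(
    query: str,
    item_entities: Iterable[str],
    qe: QEScorer,
    overrides: Optional[Dict[Tuple[str, str], float]] = None,
) -> float:
    """Aggregate per-entity QE scores into one QI score.

    `overrides` lets an operator replace individual QE scores by hand,
    which is one simple way a decomposed prediction can be intervened on.
    """
    overrides = overrides or {}
    scores = [
        overrides.get((query, e), qe(query, e)) for e in item_entities
    ]
    return soft_and(scores)


def toy_qe_scorer(query: str, entity: str) -> float:
    """Toy stand-in for a trained QE model: substring match only."""
    return 0.9 if entity.lower() in query.lower() else 0.2


qe = CachedQEScorer(toy_qe_scorer)
query = "red nike shoes"
matching = qi_relevance(query, ["nike", "shoes"], qe)
mismatched = qi_relevance(query, ["adidas", "shoes"], qe)
```

Because items share entities, the second call reuses the cached score for ("red nike shoes", "shoes") and only invokes the underlying scorer for "adidas". The per-entity scores also serve as a rationale for the final prediction: the low score for "adidas" is what pulls the second item down.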

