Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Discovering the items a user intends to find in a massive repository is one of the main goals of an e-commerce search system. Relevance prediction is essential to such a system because it helps improve search performance. A relevance model served online must perform inference that is both fast and accurate. The widely used Bi-encoder and Cross-encoder models are limited in accuracy and in inference speed, respectively. In this work, we propose a novel model called the Entity-Based Relevance Model (EBRM). We identify the entities contained in an item and decompose the QI (query-item) relevance problem into multiple QE (query-entity) relevance problems; we then aggregate their results to form the QI prediction using a soft logic formulation. The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy and to cache QE predictions for fast online inference. Using soft logic makes the prediction procedure interpretable and intervenable. We also show that pretraining the QE module with QE data auto-generated from user logs further improves overall performance. The proposed method is evaluated on labeled data from e-commerce websites. Empirical results show that it achieves promising improvements while remaining computationally efficient.
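To make the decomposition concrete, the following is a minimal sketch of the idea described in the abstract: score each query-entity pair, cache those scores, and aggregate them into a query-item score with a soft logic operator. Everything specific here is an assumption made for illustration. The abstract does not state the soft logic formula, so a noisy-OR (probabilistic sum) is used as a stand-in, and `toy_qe_score` is a hypothetical token-overlap function that takes the place of the Cross-encoder QE module. The names `CachedQEScorer`, `soft_or`, and `qi_relevance` are likewise hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def toy_qe_score(query: str, entity: str) -> float:
    """Hypothetical stand-in for the Cross-encoder QE module.

    Returns the fraction of entity tokens that also appear in the query,
    which is a value in [0, 1].
    """
    q_tokens = set(query.lower().split())
    e_tokens = set(entity.lower().split())
    if not e_tokens:
        return 0.0
    return len(q_tokens & e_tokens) / len(e_tokens)


class CachedQEScorer:
    """Wraps a QE scorer and memoizes its results per (query, entity) pair."""

    def __init__(self, scorer: Callable[[str, str], float]) -> None:
        self._scorer = scorer
        self._cache: Dict[Tuple[str, str], float] = {}
        self.misses = 0  # number of times the underlying scorer was called

    def __call__(self, query: str, entity: str) -> float:
        key = (query, entity)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._scorer(query, entity)
        return self._cache[key]


def soft_or(scores: List[float]) -> float:
    """Noisy-OR over scores in [0, 1] (assumed aggregation, not the paper's)."""
    prob_none = 1.0
    for s in scores:
        prob_none *= 1.0 - s
    return 1.0 - prob_none


def qi_relevance(query: str, entities: List[str],
                 qe: Callable[[str, str], float]) -> float:
    """Aggregate per-entity QE scores into one query-item score."""
    return soft_or([qe(query, e) for e in entities])


if __name__ == "__main__":
    qe = CachedQEScorer(toy_qe_score)
    item_entities = ["running shoes", "nike", "red"]
    print(qi_relevance("red running shoes", item_entities, qe))
    # A second item sharing entities reuses the cached QE scores.
    print(qi_relevance("red running shoes", ["running shoes", "adidas"], qe))
    print("scorer calls:", qe.misses)
```

Because the per-entity scores are kept as separate values before aggregation, they can be inspected or overridden individually, which is one way to read the abstract's claim that the procedure is interpretable and intervenable. The cache pays off when many items share the same entities, since each (query, entity) pair is scored only once.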

