Click-through-rate (CTR) prediction is essential to improving user experience and revenue in e-commerce search. With the development of deep learning, graph-based methods have been widely used to exploit graph structure extracted from user behaviors and other information to aid embedding learning. However, most previous graph-based methods focus on recommendation scenarios, so their graph structures depend heavily on the sequential information of items in user behaviors and ignore both the sequential signal of queries and the query-item correlation. In this paper, we propose a new approach named Light-weight End-to-End Graph Interest Network (EGIN) to effectively mine users' search interests and address these limitations. (i) EGIN uses the correlation and sequential information of queries and items from the search system to build a heterogeneous graph for better CTR prediction in e-commerce search. (ii) EGIN's graph embedding learning shares the same training input as CTR prediction and is trained jointly with it, making the end-to-end framework easy to deploy in large-scale search systems. EGIN is composed of three parts: a query-item heterogeneous graph, light-weight graph sampling, and a multi-interest network. The query-item heterogeneous graph captures the correlation and sequential information of queries and items efficiently through the proposed light-weight graph sampling. The multi-interest network uses the graph embeddings to capture various similarity relationships between queries and items and thereby enhance the final CTR prediction. We conduct extensive experiments on both public and industrial datasets to demonstrate the effectiveness of EGIN. In addition, the training cost of graph learning is relatively low compared with the main CTR prediction task, ensuring efficiency in practical applications.
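To make the idea of a query-item heterogeneous graph with light-weight sampling more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes search logs are available as per-user, time-ordered lists of (query, clicked item) pairs. The edge-type names (`clicks`, `clicked_by`, `next_query`, `next_item`), the functions `build_graph` and `sample_neighbors`, and the uniform sampling strategy are all hypothetical choices made for illustration; the abstract does not specify how EGIN constructs or samples its graph.

```python
import random
from collections import defaultdict


def build_graph(sessions):
    """Build a toy query-item heterogeneous graph.

    sessions: list of per-user lists of (query, clicked_item) pairs in time order.
    Returns a mapping node -> edge_type -> set of neighbor nodes, where a node
    is a ("query", text) or ("item", item_id) tuple. All edge types here are
    illustrative assumptions, not taken from the paper.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for events in sessions:
        # Correlation edges: a query and the item clicked under it.
        for query, item in events:
            graph[("query", query)]["clicks"].add(("item", item))
            graph[("item", item)]["clicked_by"].add(("query", query))
        # Sequential edges: consecutive queries and consecutive clicked items.
        for (q_prev, i_prev), (q_next, i_next) in zip(events, events[1:]):
            if q_prev != q_next:
                graph[("query", q_prev)]["next_query"].add(("query", q_next))
            if i_prev != i_next:
                graph[("item", i_prev)]["next_item"].add(("item", i_next))
    return graph


def sample_neighbors(graph, node, edge_type, k, rng):
    """Uniformly sample at most k neighbors of node along one edge type."""
    neighbors = sorted(graph.get(node, {}).get(edge_type, ()))
    if len(neighbors) <= k:
        return neighbors
    return rng.sample(neighbors, k)


if __name__ == "__main__":
    sessions = [[("red shoes", "i1"), ("red shoes", "i2"), ("running shoes", "i3")]]
    graph = build_graph(sessions)
    rng = random.Random(0)
    print(sample_neighbors(graph, ("query", "red shoes"), "clicks", 1, rng))
```

Capping the number of sampled neighbors per edge type keeps the per-example cost bounded, which is one plausible reading of "light-weight" sampling; the actual EGIN sampler may differ.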