Modern search systems rely on a fast first-stage retriever to fetch relevant items from a massive catalog. Deployed search systems often use user engagement signals to supervise bi-encoder retriever training at scale, because these signals are continuously logged from real traffic and require no additional annotation effort. However, engagement is an imperfect proxy for semantic relevance. Items may receive interactions because of popularity, promotion, attractive visuals, titles, or price, despite weak query-item relevance. These limitations are accentuated in Walmart's e-commerce sponsored search. User engagement on ad items is often structurally sparse, because how often an ad is shown depends on factors beyond relevance, such as whether the advertiser is currently running that ad, the outcome of the auction for available ad slots, bid competitiveness, and advertiser budget. Thus, even highly relevant query-ad pairs can have limited engagement signals simply because they receive few impressions. We propose a bi-encoder training framework for Walmart's e-commerce sponsored search retrieval that uses semantic relevance as the primary supervision signal, with engagement used only as a preference signal among relevant items. Concretely, we construct a context-rich training target by combining (1) graded relevance labels from a cascade of cross-encoder teacher models, (2) a multichannel retrieval prior score derived from the rank positions and cross-channel agreement of retrieval systems running in production, and (3) user engagement applied only to semantically relevant items to refine preferences. Our approach outperforms the current production system in both offline evaluation and online A/B tests, yielding consistent gains in average relevance and NDCG.
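As a rough illustration of how the three signals could be combined into one training target, the following minimal sketch may help. It is not the authors' formulation: the abstract does not give the combination rule, so the additive form, the weights `w_prior` and `w_eng`, the grade scale, the `Candidate` type, and the reciprocal-rank-fusion style prior are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    """Hypothetical container for one query-ad pair (names are illustrative)."""
    relevance_grade: int                    # teacher label, assumed scale 0 (irrelevant) .. 2
    channel_ranks: Sequence[Optional[int]]  # 1-based rank per channel, None if not retrieved
    engagement: float                       # assumed pre-normalized to [0, 1]


def retrieval_prior(channel_ranks: Sequence[Optional[int]], k: float = 60.0) -> float:
    """Reciprocal-rank-fusion style prior in [0, 1] (a stand-in, not the paper's score).

    It grows when an item ranks high and when more channels agree on it.
    """
    hits = [r for r in channel_ranks if r is not None]
    if not hits:
        return 0.0
    fused = sum(1.0 / (k + r) for r in hits)
    best_possible = len(channel_ranks) / (k + 1.0)  # rank 1 in every channel
    return fused / best_possible


def training_target(c: Candidate, min_relevant_grade: int = 1,
                    w_prior: float = 0.3, w_eng: float = 0.2) -> float:
    """Relevance-first target: grade dominates, prior and engagement refine it.

    Because w_prior + w_eng < 1 and both terms lie in [0, 1], an item with a
    higher grade always receives a higher target than one with a lower grade.
    """
    prior = retrieval_prior(c.channel_ranks)
    # Engagement counts only for semantically relevant items.
    eng = c.engagement if c.relevance_grade >= min_relevant_grade else 0.0
    return c.relevance_grade + w_prior * prior + w_eng * eng


popular_irrelevant = Candidate(relevance_grade=0, channel_ranks=[1, 1], engagement=1.0)
sparse_relevant = Candidate(relevance_grade=2, channel_ranks=[None, 5], engagement=0.0)
```

Under these assumptions, `training_target(sparse_relevant)` exceeds `training_target(popular_irrelevant)`, reflecting the idea that a relevant ad with few impressions should not be penalized relative to a heavily clicked but irrelevant one.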