Modern search systems rely on a fast first-stage retriever to fetch relevant items from a massive catalog. Deployed search systems often use user engagement signals to supervise bi-encoder retriever training at scale, because these signals are continuously logged from real traffic and require no additional annotation effort. However, engagement is an imperfect proxy for semantic relevance: items may receive interactions because of popularity, promotion, attractive visuals, titles, or price, despite weak query-item relevance. These limitations are accentuated in Walmart's e-commerce sponsored search. User engagement on ad items is often structurally sparse because how often an ad is shown depends on factors beyond relevance, such as whether the advertiser is currently running that ad, the outcome of the auction for available ad slots, bid competitiveness, and advertiser budget. Thus, even highly relevant query-ad pairs can have limited engagement signals simply due to limited impressions. We propose a bi-encoder training framework for Walmart's e-commerce sponsored search retrieval that uses semantic relevance as the primary supervision signal, with engagement used only as a preference signal among relevant items. Concretely, we construct a context-rich training target by combining (1) graded relevance labels from a cascade of cross-encoder teacher models, (2) a multichannel retrieval prior score derived from the rank positions and cross-channel agreement of retrieval systems running in production, and (3) user engagement applied only to semantically relevant items to refine preferences. Our approach outperforms the current production system in both offline evaluation and online A/B tests, yielding consistent gains in average relevance and NDCG.
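To make the three-part training target concrete, the following is a minimal sketch of how such signals could be combined into one score per query-ad pair. It is not the formulation used in this work, which the abstract does not specify. Everything in it is an assumption made for illustration: the 0 to 3 grade scale, the relevance cutoff, the reciprocal-rank form of the retrieval prior, the additive combination, the weights, and all names (Candidate, retrieval_prior, training_target).

```python
from dataclasses import dataclass
from typing import Dict

# All constants below are hypothetical, chosen only for illustration.
MAX_GRADE = 3        # assumed top teacher grade (0 = irrelevant, 3 = best match)
RELEVANT_GRADE = 2   # assumed cutoff: grades at or above this count as relevant
W_PRIOR = 0.15       # assumed weight on the multichannel retrieval prior
W_ENGAGE = 0.15      # assumed weight on user engagement
# W_PRIOR + W_ENGAGE < 1 / MAX_GRADE, so the two bonuses together can never
# lift an item above another item that has a higher relevance grade.


@dataclass
class Candidate:
    relevance_grade: int           # graded label from the teacher cascade
    channel_ranks: Dict[str, int]  # 1-based rank in each channel that retrieved the item
    engagement: float              # normalized engagement in [0, 1]


def retrieval_prior(channel_ranks: Dict[str, int], num_channels: int) -> float:
    """Mean reciprocal rank over all channels; a channel that missed the item adds 0.

    The value grows both when the item ranks high in a channel and when more
    channels agree on retrieving it, and it always lies in [0, 1].
    """
    if num_channels <= 0:
        raise ValueError("num_channels must be positive")
    return sum(1.0 / rank for rank in channel_ranks.values()) / num_channels


def training_target(c: Candidate, num_channels: int) -> float:
    """Relevance-first target: grade dominates, prior and engagement refine it."""
    base = c.relevance_grade / MAX_GRADE
    prior = retrieval_prior(c.channel_ranks, num_channels)
    # Engagement is counted only for items the teachers judged relevant.
    is_relevant = c.relevance_grade >= RELEVANT_GRADE
    engagement = min(max(c.engagement, 0.0), 1.0) if is_relevant else 0.0
    return base + W_PRIOR * prior + W_ENGAGE * engagement


if __name__ == "__main__":
    popular_irrelevant = Candidate(0, {}, engagement=1.0)
    sparse_relevant = Candidate(3, {"lexical": 1, "dense": 2}, engagement=0.0)
    print(training_target(popular_irrelevant, num_channels=2))
    print(training_target(sparse_relevant, num_channels=2))
```

Under these assumptions, a heavily clicked but irrelevant item receives no credit for its engagement, while a relevant ad with few impressions still gets a high target from its relevance grade and retrieval prior. Among items with the same relevant grade, higher engagement yields a higher target, which matches the role of engagement as a preference signal among relevant items.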