Information Retrieval (IR) systems used in search and recommendation platforms frequently employ Learning-to-Rank (LTR) models to rank items in response to user queries. These models rely heavily on features derived from user interactions, such as clicks and engagement data. This dependence introduces cold-start issues for items lacking user engagement and makes it hard to adapt to non-stationary shifts in user behavior over time. We address both challenges holistically as an online learning problem and propose BayesCNS, a Bayesian approach designed to handle cold start and non-stationary distribution shifts in search systems at scale. BayesCNS achieves this by estimating prior distributions for user-item interactions, which are continuously updated with new user interactions gathered online. This online learning procedure is guided by a ranker model, enabling efficient exploration of relevant items using contextual information provided by the ranker. We successfully deployed BayesCNS in a large-scale search system and demonstrated its efficacy through comprehensive offline and online experiments. Notably, an online A/B experiment showed a 10.60% increase in new item interactions and a 1.05% improvement in overall success metrics over the existing production baseline.
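The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea it describes: a prior over each item's interaction rate, updated online, with a ranker score supplying context for items that have no engagement yet. Everything concrete here is an assumption made for illustration and not BayesCNS itself: the Beta-Bernoulli model, the `strength` and `discount` parameters, the use of Thompson sampling for exploration, and all function names.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ItemPosterior:
    """Beta belief over an item's interaction rate (assumed model)."""
    alpha: float
    beta: float

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def prior_from_ranker(score: float, strength: float = 2.0) -> ItemPosterior:
    """Build a weak prior centred on a ranker score (hypothetical helper).

    The score is clamped away from 0 and 1 so both Beta parameters stay positive.
    """
    score = min(max(score, 1e-3), 1.0 - 1e-3)
    return ItemPosterior(alpha=score * strength, beta=(1.0 - score) * strength)


def update(post: ItemPosterior, clicked: bool, discount: float = 0.99) -> ItemPosterior:
    """Decay old evidence, then add one new observation.

    A discount below 1 lets the belief drift with non-stationary behavior;
    this is one common choice, assumed here for illustration.
    """
    alpha = post.alpha * discount + (1.0 if clicked else 0.0)
    beta = post.beta * discount + (0.0 if clicked else 1.0)
    return ItemPosterior(alpha, beta)


def rank(posteriors: Dict[str, ItemPosterior], rng: random.Random) -> List[str]:
    """Thompson sampling: draw one rate per item and sort by the draws."""
    draws = {item: rng.betavariate(p.alpha, p.beta) for item, p in posteriors.items()}
    return sorted(draws, key=draws.get, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    beliefs = {
        "new_item": prior_from_ranker(0.7),        # no engagement yet
        "old_item": ItemPosterior(5.0, 95.0),      # well-observed, low rate
    }
    beliefs["new_item"] = update(beliefs["new_item"], clicked=True)
    print(rank(beliefs, rng))
```

Because the new item's belief is wide, its sampled rate varies a lot between calls, so it is surfaced often enough to collect evidence; as observations accumulate, the belief narrows and ranking settles toward the observed rate.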