Information Retrieval (IR) systems used in search and recommendation platforms frequently employ Learning-to-Rank (LTR) models to rank items in response to user queries. These models rely heavily on features derived from user interactions, such as clicks and engagement data. This dependence creates a cold start problem for items that have no user engagement yet, and it makes the models slow to adapt to non-stationary shifts in user behavior over time. We treat both challenges jointly as an online learning problem and propose BayesCNS, a Bayesian approach designed to handle cold start and non-stationary distribution shifts in search systems at scale. BayesCNS estimates prior distributions for user-item interactions and continuously updates them with new user interactions gathered online. This online learning procedure is guided by a ranker model, which supplies contextual information that enables efficient exploration of relevant items. We deployed BayesCNS in a large-scale search system and demonstrated its efficacy through comprehensive offline and online experiments. Notably, an online A/B experiment showed a 10.60% increase in new item interactions and a 1.05% improvement in overall success metrics over the existing production baseline.
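The abstract does not specify the form of the priors or the update rule, so the following is only a minimal illustrative sketch of the general idea (a prior per item, updated online, with exploration informed by a ranker score). It assumes a Beta-Bernoulli model with Thompson-style sampling, which may differ from what BayesCNS actually uses. The names `ItemBelief`, `prior_from_ranker`, `update`, and `rank_by_sampling`, and the parameters `strength` and `decay`, are hypothetical and introduced here purely for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ItemBelief:
    """Beta(alpha, beta) belief over an item's interaction probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def prior_from_ranker(score: float, strength: float = 2.0) -> ItemBelief:
    """Hypothetical prior: centre the Beta on a ranker score in (0, 1).

    `strength` is an assumed pseudo-count that controls how much the prior
    is trusted before any interactions have been observed.
    """
    score = min(max(score, 1e-3), 1.0 - 1e-3)  # keep both parameters positive
    return ItemBelief(alpha=score * strength, beta=(1.0 - score) * strength)


def update(belief: ItemBelief, clicked: bool, decay: float = 0.99) -> ItemBelief:
    """Online conjugate update with exponential forgetting.

    The decay factor is an assumption made here to let old evidence fade,
    as one simple way to follow non-stationary behaviour.
    """
    alpha = decay * belief.alpha + (1.0 if clicked else 0.0)
    beta = decay * belief.beta + (0.0 if clicked else 1.0)
    return ItemBelief(alpha, beta)


def rank_by_sampling(beliefs: Dict[str, ItemBelief], rng: random.Random) -> List[str]:
    """Thompson-style ranking: order items by one draw from each belief."""
    draws = {item: rng.betavariate(b.alpha, b.beta) for item, b in beliefs.items()}
    return sorted(draws, key=draws.get, reverse=True)


# Usage: a cold-start item seeded from a ranker score, next to an
# established item with a long interaction history.
rng = random.Random(0)
beliefs = {
    "new_item": prior_from_ranker(0.6),
    "old_item": ItemBelief(alpha=30.0, beta=70.0),
}
for _ in range(50):
    beliefs["new_item"] = update(beliefs["new_item"], clicked=True)
print(rank_by_sampling(beliefs, rng))
```

Because the new item starts with a weak prior, its sampled values vary widely at first, which gives it a chance to be shown and to collect feedback; as interactions accumulate, its belief concentrates and the ranking becomes less exploratory.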