A Probabilistic Position Bias Model for Short-Video Recommendation Feeds

Modern web-based platforms show ranked lists of recommendations to users, attempting to maximise user satisfaction or business metrics. Typically, the goal of such systems boils down to maximising the exposure probability of items that are deemed "reward-maximising" according to a metric of interest. This general framing covers streaming applications, e-commerce, job recommendations, and even web search. Position bias models or user models can be used to estimate exposure probabilities for each use-case, tailored to how users interact with the presented rankings. A unifying factor across these diverse problem settings is that typically only one or a few items will be engaged with (clicked, streamed, ...) before a user leaves the ranked list. Short-video feeds on social media platforms diverge from this general framing in several ways, most notably in that users do not tend to leave the feed after, for example, liking a post. Indeed, seemingly infinite feeds invite users to scroll further down the ranked list. For this reason, existing position bias or user models tend to fall short in such settings, as they do not accurately capture how users interact with the feed. In this work, we propose a novel and probabilistically sound personalised position bias model for feed recommendations. We focus on a first-level feed in a hierarchical structure, where users may enter a second-level feed via any given first-level item. We posit that users come to the platform with a scrolling budget drawn from some distribution, and show how the survival function of that distribution can be used to obtain closed-form estimates of personalised exposure probabilities. Empirical insights from a large-scale social media platform show that our probabilistic position bias model captures empirical exposure more accurately than existing models, and that it paves the way for unbiased evaluation and learning-to-rank.
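
To make the scrolling-budget idea concrete, the following is a minimal sketch of how a survival function could yield position-wise exposure probabilities. It is an illustration under stated assumptions, not the model proposed in this work: the budget is assumed to be counted in feed positions, the geometric distribution is chosen only because its survival function has a simple closed form, and the names `geometric_survival`, `exposure_probabilities`, `empirical_exposure` and the per-user parameter `p_stop` are hypothetical.

```python
from typing import List, Sequence


def geometric_survival(k: int, p_stop: float) -> float:
    """P(B >= k) for a budget B ~ Geometric(p_stop) supported on {1, 2, ...}.

    Assumption: the budget is the number of feed positions a user will view.
    """
    if k < 1:
        raise ValueError("positions are 1-indexed")
    if not 0.0 < p_stop <= 1.0:
        raise ValueError("p_stop must lie in (0, 1]")
    return (1.0 - p_stop) ** (k - 1)


def exposure_probabilities(n_positions: int, p_stop: float) -> List[float]:
    """Exposure probability of each position 1..n_positions.

    The item at position k is exposed exactly when the budget reaches k,
    so its exposure probability is the survival function evaluated at k.
    A personalised model could use a different (hypothetical) p_stop per user.
    """
    return [geometric_survival(k, p_stop) for k in range(1, n_positions + 1)]


def empirical_exposure(budgets: Sequence[int], n_positions: int) -> List[float]:
    """Fraction of observed sessions whose scroll depth reached each position."""
    total = len(budgets)
    if total == 0:
        raise ValueError("need at least one observed session")
    return [
        sum(1 for b in budgets if b >= k) / total
        for k in range(1, n_positions + 1)
    ]


if __name__ == "__main__":
    # Model-based exposure for a user who stops with probability 0.5 per item.
    print(exposure_probabilities(3, 0.5))
    # Empirical exposure from four hypothetical sessions of depth 1, 2, 2 and 4.
    print(empirical_exposure([1, 2, 2, 4], 3))
```

Comparing the two lists position by position is one simple way to check how well an assumed budget distribution matches logged scroll depths; any other distribution with a tractable survival function could be substituted for the geometric one.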
