I examine a conceptual model of a recommendation system (RS) with user inflow and churn dynamics. When inflow and churn balance out, the user distribution reaches a steady state. Changing the recommendation algorithm alters the steady state and creates a transition period. During this period, the RS behaves differently from how it behaves in its new steady state. In particular, A/B experiment metrics obtained during the transition period are biased indicators of the RS's long-term performance. Scholars and practitioners, however, often conduct A/B tests shortly after introducing new algorithms to validate their effectiveness. This A/B experiment paradigm, widely regarded as the gold standard for assessing RS improvements, may consequently yield false conclusions. I also briefly discuss the data bias caused by the user retention dynamics.
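The gap between transition-period and steady-state metrics can be illustrated with a toy simulation. The sketch below is not the model analyzed in this work; it is a minimal illustration under assumed dynamics: two hypothetical user types, a constant inflow per period, and a per-period churn probability and engagement level that depend on the algorithm. All parameter values and function names are made up for illustration.

```python
def step(population, inflow, churn):
    """Advance one period: surviving users plus new arrivals, per user type."""
    return [n * (1.0 - c) + i for n, i, c in zip(population, inflow, churn)]


def steady_state(inflow, churn):
    """Fixed point of step(): arrivals equal churned users, so n* = inflow / churn."""
    return [i / c for i, c in zip(inflow, churn)]


def avg_engagement(population, engagement):
    """Population-weighted mean engagement, standing in for an A/B test metric."""
    total = sum(population)
    return sum(n * e for n, e in zip(population, engagement)) / total


# Hypothetical parameters for two user types (all values are illustrative).
INFLOW = [100.0, 100.0]
OLD = {"churn": [0.10, 0.10], "engagement": [1.0, 1.0]}
NEW = {"churn": [0.20, 0.05], "engagement": [1.3, 0.8]}


def simulate_treatment(periods):
    """Metric of the treatment arm in each period after switching OLD -> NEW."""
    # The treatment arm starts from the user distribution of the old steady state.
    population = steady_state(INFLOW, OLD["churn"])
    trajectory = []
    for _ in range(periods):
        trajectory.append(avg_engagement(population, NEW["engagement"]))
        population = step(population, INFLOW, NEW["churn"])
    return trajectory


control = avg_engagement(steady_state(INFLOW, OLD["churn"]), OLD["engagement"])
trajectory = simulate_treatment(periods=200)
long_run = avg_engagement(steady_state(INFLOW, NEW["churn"]), NEW["engagement"])

print(f"control metric:              {control:.3f}")
print(f"treatment, first period:     {trajectory[0]:.3f}")
print(f"treatment, new steady state: {long_run:.3f}")
```

With these assumed numbers, the treatment arm reads 1.05 against a control of 1.00 in the first period, but settles at 0.90 once the user distribution reaches the new steady state. An A/B test run shortly after the switch would therefore favor an algorithm that is worse in the long run.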