In modern recommendation systems, the standard pipeline trains machine learning models on historical data to predict user behavior and continuously improve recommendations. However, these data training loops can introduce interference in A/B tests, because data generated by the control and treatment algorithms, which may follow different distributions, are combined. To address this interference, we introduce a novel approach called weighted training. The approach trains a model to predict the probability that each data point appears in the treatment or the control data, and then applies weighted losses during model training. We show that this approach achieves the least variance among all estimators that do not cause shifts in the training distributions. Simulation studies show that our approach has lower bias and variance than other methods.
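The two steps of weighted training described above (fit a model for the probability that a data point comes from the treatment data, then train with weighted losses) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the synthetic one-dimensional data, the logistic classifier, the choice of weights (the estimated treatment probability for the treatment-side model and its complement for the control-side model), and the helper names `fit_logistic`, `predict_proba`, and `weighted_least_squares`. The exact weights used in the paper may differ.

```python
import numpy as np


def fit_logistic(X, z, lr=0.5, steps=2000):
    """Fit P(z = 1 | x) by gradient descent on the mean logistic loss."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))
        theta -= lr * Xb.T @ (p - z) / len(z)
    return theta


def predict_proba(theta, X):
    """Estimated probability that each row of X came from the treatment arm."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ theta))


def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - [x_i, 1] @ beta)^2, a weighted training loss."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ (w[:, None] * Xb)
    b = Xb.T @ (w * y)
    return np.linalg.solve(A, b)


# Synthetic stand-in for logged data (assumption): the two arms produce
# features with different distributions, and the logs are then combined.
rng = np.random.default_rng(0)
n = 2000
X_control = rng.normal(loc=-1.0, scale=1.0, size=(n, 1))
X_treat = rng.normal(loc=1.0, scale=1.0, size=(n, 1))
X = np.vstack([X_control, X_treat])
z = np.concatenate([np.zeros(n), np.ones(n)])  # 1 = generated by treatment
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=2 * n)  # hypothetical outcome

# Step 1: model the probability that a data point appears in the treatment data.
theta = fit_logistic(X, z)
p_treat = predict_proba(theta, X)

# Step 2: train on the combined data with weighted losses (assumed weights).
w_treat = p_treat
w_control = 1.0 - p_treat
beta_treat = weighted_least_squares(X, y, w_treat)
beta_control = weighted_least_squares(X, y, w_control)

# Sanity check: the reweighted feature means should sit near each arm's own mean.
pooled_mean = X[:, 0].mean()
weighted_mean_treat = np.average(X[:, 0], weights=w_treat)
weighted_mean_control = np.average(X[:, 0], weights=w_control)
print(pooled_mean, weighted_mean_treat, weighted_mean_control)
```

In this toy setup, weighting the combined data by the estimated treatment probability makes it resemble the treatment arm's own data, so each model is fit to a distribution close to the one its arm would generate by itself. This is one plausible reading of training that does not shift the training distributions, not a reproduction of the paper's estimator.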