Industrial recommender systems face the challenge of operating in non-stationary environments, where data distribution shifts arise from evolving user behaviors over time. To tackle this challenge, a common approach is to periodically re-train or incrementally update deployed deep models with newly observed data, resulting in a continual training process. However, the conventional learning paradigm of neural networks relies on iterative gradient-based updates with a small learning rate, making it slow for large recommendation models to adapt. In this paper, we introduce ReLoop2, a self-correcting learning loop that facilitates fast model adaptation in online recommender systems through responsive error compensation. Inspired by the slow-fast complementary learning system observed in human brains, we propose an error memory module that directly stores error samples from incoming data streams. These stored samples are subsequently leveraged to compensate for model prediction errors during testing, particularly under distribution shifts. The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation. We evaluate the effectiveness of ReLoop2 on three open benchmark datasets as well as a real-world production dataset. The results demonstrate the potential of ReLoop2 in enhancing the responsiveness and adaptiveness of recommender systems operating in non-stationary environments.
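To make the error-compensation idea concrete, the following is a minimal sketch of how an error memory might sit beside a slowly updated base model. The abstract does not specify the retrieval or compensation rule, so everything here is an assumption made for illustration: the class name `ErrorMemory`, the FIFO eviction policy, the nearest-neighbor lookup over sample embeddings, the distance-based weighting, and the interpolation weight `lam` are all hypothetical and not taken from the paper.

```python
import math
from collections import deque


class ErrorMemory:
    """Hypothetical fixed-size store of (embedding, error) pairs.

    Assumptions for illustration only: FIFO eviction, Euclidean k-nearest-neighbor
    lookup, and a linear correction weighted by `lam`.
    """

    def __init__(self, capacity=1000, k=3, lam=0.5):
        # deque(maxlen=...) drops the oldest entry once capacity is reached,
        # so the memory is continually refreshed by newly observed samples.
        self.entries = deque(maxlen=capacity)
        self.k = k
        self.lam = lam

    def write(self, embedding, label, base_pred):
        """Store the base model's signed error for one observed sample."""
        self.entries.append((tuple(embedding), float(label) - float(base_pred)))

    def estimate_error(self, embedding):
        """Distance-weighted average error of the k nearest stored samples."""
        if not self.entries:
            return 0.0
        query = tuple(embedding)
        scored = sorted(
            ((math.dist(key, query), err) for key, err in self.entries),
            key=lambda pair: pair[0],
        )[: self.k]
        d_min = scored[0][0]
        # Shift by the smallest distance so the closest neighbor has weight 1.
        weights = [math.exp(-(d - d_min)) for d, _ in scored]
        return sum(w * e for w, (_, e) in zip(weights, scored)) / sum(weights)

    def compensate(self, embedding, base_pred):
        """Correct a predicted probability and clamp it to [0, 1]."""
        corrected = base_pred + self.lam * self.estimate_error(embedding)
        return min(1.0, max(0.0, corrected))
```

In a serving loop under these assumptions, each request would call `compensate` on the base model's prediction, and `write` would be called once the true label (for example, a click) is observed. The base model itself could still be retrained on its usual slower schedule, while the memory reacts to recent errors without any gradient update.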