Industrial recommender systems face the challenge of operating in non-stationary environments, where data distribution shifts arise from evolving user behaviors over time. To tackle this challenge, a common approach is to periodically re-train or incrementally update deployed deep models with newly observed data, resulting in a continual training process. However, the conventional learning paradigm of neural networks relies on iterative gradient-based updates with a small learning rate, making it slow for large recommendation models to adapt. In this paper, we introduce ReLoop2, a self-correcting learning loop that facilitates fast model adaptation in online recommender systems through responsive error compensation. Inspired by the slow-fast complementary learning system observed in human brains, we propose an error memory module that directly stores error samples from incoming data streams. These stored samples are subsequently leveraged to compensate for model prediction errors during testing, particularly under distribution shifts. The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation. We evaluate the effectiveness of ReLoop2 on three open benchmark datasets as well as a real-world production dataset. The results demonstrate the potential of ReLoop2 in enhancing the responsiveness and adaptiveness of recommender systems operating in non-stationary environments.
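The abstract does not say how the error memory is keyed, how stored samples are retrieved, or how the compensation is combined with the model output. To make the idea concrete, the following is only a minimal sketch under assumed choices: a fixed-capacity buffer of (feature vector, signed error) pairs, nearest-neighbour lookup by squared Euclidean distance, and a linear correction clipped to [0, 1]. The names `ErrorMemory`, `compensate`, `k`, and `strength` are hypothetical and are not taken from ReLoop2 itself.

```python
from collections import deque
from typing import Deque, Sequence, Tuple

Entry = Tuple[Tuple[float, ...], float]


class ErrorMemory:
    """Fixed-size buffer of (feature vector, prediction error) pairs.

    Assumption: a simple FIFO buffer stands in for the paper's memory module.
    """

    def __init__(self, capacity: int = 1000) -> None:
        # Oldest entries are evicted first, so the memory follows recent data.
        self.entries: Deque[Entry] = deque(maxlen=capacity)

    def write(self, key: Sequence[float], label: float, prediction: float) -> None:
        """Store the signed error (label - prediction) observed for one sample."""
        self.entries.append((tuple(key), label - prediction))

    def read(self, key: Sequence[float], k: int = 3) -> float:
        """Return the mean error of the k stored samples nearest to `key`."""
        if not self.entries:
            return 0.0

        def sq_dist(entry: Entry) -> float:
            return sum((a - b) ** 2 for a, b in zip(entry[0], key))

        nearest = sorted(self.entries, key=sq_dist)[:k]
        return sum(err for _, err in nearest) / len(nearest)


def compensate(
    prediction: float,
    memory: ErrorMemory,
    key: Sequence[float],
    k: int = 3,
    strength: float = 0.5,
) -> float:
    """Shift a predicted probability by the retrieved error, clipped to [0, 1]."""
    corrected = prediction + strength * memory.read(key, k)
    return min(1.0, max(0.0, corrected))


if __name__ == "__main__":
    memory = ErrorMemory(capacity=4)
    # During serving, each newly labelled sample refreshes the memory.
    memory.write([1.0, 0.0], label=1.0, prediction=0.2)
    memory.write([0.9, 0.1], label=1.0, prediction=0.4)
    memory.write([0.0, 1.0], label=0.0, prediction=0.5)
    # A new request close to the first two samples is nudged upward.
    print(compensate(0.3, memory, [1.0, 0.0], k=2))
```

In this sketch the slow path (gradient updates to the deployed model) is left untouched, while the fast path only appends to and reads from the buffer, which is why it can react to a shift without retraining. A production system would presumably replace the linear scan with an approximate nearest-neighbour index.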