Personalized Negative Reservoir for Incremental Learning in Recommender Systems

Recommender systems have become an integral part of online platforms. The volume of training data and the number of user interactions grow every day, and exploring larger, more expressive models has become a necessary pursuit to improve user experience. This progression, however, carries an increased computational burden. In commercial settings, once a recommendation model has been trained and deployed, it typically needs to be updated frequently as new client data arrive. Cumulatively, the mounting volume of data is guaranteed to eventually make full-batch retraining of the model from scratch computationally infeasible. Naively fine-tuning solely on the new data runs into the well-documented problem of catastrophic forgetting. Although negative sampling is a crucial part of training with implicit feedback, no specialized technique exists that is tailored to the incremental learning framework. In this work, we take a first step and propose a personalized negative reservoir strategy that supplies negative samples for the standard triplet loss. This technique balances alleviation of forgetting with plasticity by encouraging the model to remember stable user preferences and to selectively forget when user interests change. We derive the mathematical formulation of a negative sampler that populates and updates the reservoir. We integrate our design into three state-of-the-art, commonly used incremental recommendation models. We show that these concrete realizations of our negative reservoir framework achieve state-of-the-art results on standard benchmarks across multiple standard top-k evaluation metrics.
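As a rough illustration of how a per-user negative reservoir can feed a triplet-style loss, here is a minimal sketch. It is not the sampler derived in this work: the abstract does not give that formulation, so the reservoir below is refilled uniformly at random as a placeholder. The class `NegativeReservoir`, the function `bpr_triplet_loss`, the dot-product scoring, and the BPR-style form of the loss are all assumptions made for illustration.

```python
import math
import random


def bpr_triplet_loss(user_vec, pos_vec, neg_vec):
    """BPR-style triplet loss: -log(sigmoid(score(u, pos) - score(u, neg))).

    Scores are plain dot products (an assumption for this sketch).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    margin = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    # softplus(-margin) equals -log(sigmoid(margin)); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


class NegativeReservoir:
    """Hypothetical per-user store of negative item ids with a fixed capacity."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = {}  # user id -> list of negative item ids
        self.rng = random.Random(seed)

    def update(self, user, candidates, interacted):
        """Keep old negatives that are still valid, then top up from candidates.

        Items the user has interacted with are dropped, so the reservoir can
        follow changing interests. The top-up is uniform at random, which is a
        placeholder and NOT the sampler derived in the paper.
        """
        kept = [i for i in self.items.get(user, []) if i not in interacted]
        fresh = [i for i in candidates if i not in interacted and i not in kept]
        self.rng.shuffle(fresh)
        self.items[user] = (kept + fresh)[: self.capacity]

    def sample(self, user):
        """Draw one negative item id for the user from the reservoir."""
        return self.rng.choice(self.items[user])
```

In an incremental setting, one would call `update` for each user when a new data block arrives and then draw negatives with `sample` while minimizing the loss over (user, positive, negative) triplets.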

