Retrieval and Distill: A Temporal Data Shift-Free Paradigm for Online Recommendation System

Recommendation systems suffer from temporal data shift: the distribution of historical data is inconsistent with that of online data. Most existing models focus on utilizing updated data and overlook the transferable, shift-free information that can be learned from shifting data. We propose the Temporal Invariance of Association theorem, which suggests that, given a fixed search space, the relationship between a data sample and the data in the search space remains invariant over time. Leveraging this principle, we design a retrieval-based recommendation framework that trains a data shift-free relevance network on shifting data, significantly enhancing the predictive performance of the original model in the recommendation system. However, retrieval-based recommendation models incur substantial inference time costs when deployed online. To address this, we further design a distillation framework that distills the information in the relevance network into a parameterized module, again using shifting data. The distilled model can be deployed online alongside the original model with only a minimal increase in inference time. Extensive experiments on multiple real datasets demonstrate that our framework significantly improves the performance of the original model by utilizing shifting data.
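
The retrieve-then-distill idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract's learned relevance network is replaced here by a plain k-nearest-neighbour label average over a fixed search space, the student is an ordinary least-squares linear model, and all names, sizes, and the synthetic data (`pool_x`, `pool_y`, `retrieve_signal`, `student_signal`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed search space: 500 historical samples with
# 8-dimensional features and synthetic binary click labels.
dim, n_pool, k = 8, 500, 10
w_true = rng.normal(size=dim)
pool_x = rng.normal(size=(n_pool, dim))
pool_y = (pool_x @ w_true > 0).astype(float)


def retrieve_signal(queries, pool_x, pool_y, k):
    """Teacher (stand-in for a relevance network): mean label of the
    k pool items most similar to each query under cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = pool_x / np.linalg.norm(pool_x, axis=1, keepdims=True)
    sims = q @ p.T  # shape (n_queries, n_pool)
    # Indices of the k largest similarities per row (unordered).
    topk = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    return pool_y[topk].mean(axis=1)


def features(x):
    """Append a bias column so the student can learn an offset."""
    return np.hstack([x, np.ones((x.shape[0], 1))])


# Distillation: fit a parametric student to reproduce the teacher's
# retrieval signal, so that no search is needed at inference time.
train_q = rng.normal(size=(2000, dim))
teacher_train = retrieve_signal(train_q, pool_x, pool_y, k)
student_w, *_ = np.linalg.lstsq(features(train_q), teacher_train, rcond=None)


def student_signal(queries):
    """Student: a single matrix-vector product, no retrieval."""
    return features(queries) @ student_w


# Compare teacher and student on unseen queries.
test_q = rng.normal(size=(200, dim))
teacher_test = retrieve_signal(test_q, pool_x, pool_y, k)
student_test = student_signal(test_q)
corr = np.corrcoef(teacher_test, student_test)[0, 1]
print(f"teacher/student correlation: {corr:.3f}")
```

In this sketch the teacher scans the whole search space for every query, while the student costs one dot product per query, which is the kind of inference-time saving the abstract attributes to the distilled module. In a real system the student's output would presumably be fed to the original model as an additional input rather than used on its own.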

