Measuring the stability and plasticity of recommender systems

The typical offline protocol for evaluating recommendation algorithms is to collect a dataset of user-item interactions, use part of it to train a model, and use the remaining data to measure how closely the model's recommendations match the observed user interactions. This protocol is straightforward, useful and practical, but it only captures the performance of a particular model trained at some point in the past. Online systems, however, evolve over time. Models should generally reflect such changes, so they are frequently retrained with recent data. But if this is the case, to what extent can we trust previous evaluations? How will a model perform when a different pattern (re)emerges? In this paper we propose a methodology to study how recommendation models behave when they are retrained. The idea is to profile algorithms according to their ability to retain past patterns (stability) on the one hand and to adapt, quickly, to changes (plasticity) on the other. We devise an offline evaluation protocol that provides detail on the long-term behavior of models and that is agnostic to datasets, algorithms and metrics. To illustrate the potential of this framework, we present preliminary results for three different types of algorithms on the GoodReads dataset. These results suggest that stability and plasticity profiles differ depending on the algorithmic technique, and that there may be a trade-off between stability and plasticity. Although additional experiments will be necessary to confirm these observations, they already illustrate the usefulness of the proposed framework for gaining insight into the long-term dynamics of recommendation models.
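The abstract does not spell out the protocol, so the following is only a minimal sketch of one way such a profile could be computed, not the authors' method. It assumes the data is already cut into chronological periods, each with its own train and test interactions; after every retraining step it scores the model on the current period's test set (read here as plasticity) and on the test sets of all earlier periods (read here as stability). The `DecayedPopularity` model, the `decay` parameter, the `hit_rate` metric and the `profile` function are all hypothetical names introduced for illustration.

```python
from collections import Counter
from statistics import mean


class DecayedPopularity:
    """Toy recommender (an assumption, not from the paper): item popularity
    with exponential forgetting. decay=1.0 never forgets; smaller values
    forget older interactions faster."""

    def __init__(self, decay=1.0):
        self.decay = decay
        self.scores = Counter()

    def update(self, interactions):
        # Down-weight everything learned so far, then add the new interactions.
        for item in list(self.scores):
            self.scores[item] *= self.decay
        for _user, item in interactions:
            self.scores[item] += 1.0

    def recommend(self, k):
        # Same top-k list for every user: the k highest-scoring items.
        return [item for item, _score in self.scores.most_common(k)]


def hit_rate(model, test, k):
    """Fraction of held-out interactions whose item is in the top-k list."""
    top_k = set(model.recommend(k))
    return sum(item in top_k for _user, item in test) / len(test)


def profile(model, periods, k=1):
    """Retrain period by period and record two curves.

    periods: chronological list of (train, test) pairs, each a list of
             (user, item) tuples.
    Returns (plasticity, stability):
      plasticity[t]   = hit rate on period t's test set right after training on it
      stability[t-1]  = mean hit rate on the test sets of periods 0..t-1,
                        measured after training on period t
    """
    plasticity, stability = [], []
    for t, (train, test) in enumerate(periods):
        model.update(train)
        plasticity.append(hit_rate(model, test, k))
        if t > 0:
            stability.append(
                mean(hit_rate(model, past_test, k) for _, past_test in periods[:t])
            )
    return plasticity, stability


# Toy data: item "a" dominates period 0, item "b" dominates period 1.
toy_periods = [
    ([("u1", "a"), ("u2", "a"), ("u3", "a")], [("u4", "a")]),
    ([("u1", "b"), ("u2", "b")], [("u4", "b")]),
]
never_forgets = profile(DecayedPopularity(decay=1.0), toy_periods)
forgets_fast = profile(DecayedPopularity(decay=0.1), toy_periods)
```

On this toy data the non-forgetting model keeps recommending "a" after the shift (stable but not plastic), while the fast-forgetting model switches to "b" (plastic but not stable), which is the kind of trade-off the abstract alludes to. Because the model and the metric are passed in rather than hard-coded, the same loop could be reused with other algorithms and metrics.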

