Invariant Collaborative Filtering to Popularity Distribution Shift

Collaborative Filtering (CF) models, despite their great success, suffer from severe performance drops due to popularity distribution shifts, where these changes are ubiquitous and inevitable in real-world scenarios. Unfortunately, most leading popularity debiasing strategies, rather than tackling the vulnerability of CF models to varying popularity distributions, require prior knowledge of the test distribution to identify the degree of bias and further learn the popularity-entangled representations to mitigate the bias. Consequently, these models result in significant performance benefits in the target test set, while dramatically deviating the recommendation from users' true interests without knowing the popularity distribution in advance. In this work, we propose a novel learning framework, Invariant Collaborative Filtering (InvCF), to discover disentangled representations that faithfully reveal the latent preference and popularity semantics without making any assumption about the popularity distribution. At its core is the distillation of unbiased preference representations (i.e., user preference on item property), which are invariant to the change of popularity semantics, while filtering out the popularity feature that is unstable or outdated. Extensive experiments on five benchmark datasets and four evaluation settings (i.e., synthetic long-tail, unbiased, temporal split, and out-of-distribution evaluations) demonstrate that InvCF outperforms the state-of-the-art baselines in terms of popularity generalization ability on real recommendations. Visualization studies shed light on the advantages of InvCF for disentangled representation learning. Our codes are available at https://github.com/anzhang314/InvCF.

翻译：协同过滤模型尽管取得了巨大成功，但由于流行度分布偏移——这一现象在现实场景中普遍存在且不可避免——其性能会严重下降。遗憾的是，大多数主流去偏策略并非直接解决协同过滤模型对多变流行度分布脆弱性的问题，而是需要预先知晓测试分布以识别偏差程度，进而学习与流行度纠缠的表示来缓解偏差。因此，这类模型在目标测试集上虽能获得显著性能提升，但在缺乏先验流行度分布信息时，会严重偏离用户真实兴趣。本文提出新颖的学习框架——不变性协同过滤（InvCF），无需对流行度分布做任何假设，即可发现能够忠实揭示潜在偏好与流行度语义的解耦表示。其核心在于蒸馏对流行度语义变化保持不变的、无偏的偏好表示（即用户对物品属性的偏好），同时滤除不稳定或过时的流行度特征。在五个基准数据集和四种评估设置（合成长尾、无偏、时间分割及分布外评估）上的大量实验表明，InvCF在实际推荐场景中的流行度泛化能力优于现有最先进基线。可视化研究进一步阐明了InvCF在解耦表示学习中的优势。我们的代码开源在https://github.com/anzhang314/InvCF。

相关内容

协同过滤

关注 224

协同过滤（英语：Collaborative Filtering），简单来说是利用某兴趣相投、拥有共同经验之群体的喜好来推荐用户感兴趣的信息，个人透过合作的机制给予信息相当程度的回应（如评分）并记录下来以达到过滤的目的进而帮助别人筛选信息，回应不一定局限于特别感兴趣的，特别不感兴趣信息的纪录也相当重要。协同过滤又可分为评比（rating）或者群体过滤（social filtering）。其后成为电子商务当中很重要的一环，即根据某顾客以往的购买行为以及从具有相似购买行为的顾客群的购买行为去推荐这个顾客其“可能喜欢的品项”，也就是借由社群的喜好提供个人化的信息、商品等的推荐服务。除了推荐之外，近年来也发展出数学运算让系统自动计算喜好的强弱进而去芜存菁使得过滤的内容更有依据，也许不是百分之百完全准确，但由于加入了强弱的评比让这个概念的应用更为广泛，除了电子商务之外尚有信息检索领域、网络个人影音柜、个人书架等的应用等。

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【SIGIR2020-NUS】解缠图协同过滤，Disentangled Graph Collaborative Filtering

专知会员服务

60+阅读 · 2020年7月6日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日