Invariant Collaborative Filtering to Popularity Distribution Shift

Collaborative Filtering (CF) models, despite their great success, suffer severe performance drops under popularity distribution shifts, which are ubiquitous and inevitable in real-world scenarios. Unfortunately, most leading popularity debiasing strategies do not tackle the vulnerability of CF models to varying popularity distributions. Instead, they require prior knowledge of the test distribution to identify the degree of bias, and they then learn popularity-entangled representations to mitigate that bias. Consequently, these models achieve significant gains on the target test set, but when the popularity distribution is not known in advance, their recommendations deviate dramatically from users' true interests. In this work, we propose a novel learning framework, Invariant Collaborative Filtering (InvCF), to discover disentangled representations that faithfully reveal the latent preference and popularity semantics without making any assumption about the popularity distribution. At its core is the distillation of unbiased preference representations (i.e., user preference on item property), which are invariant to changes in popularity semantics, while filtering out popularity features that are unstable or outdated. Extensive experiments on five benchmark datasets and four evaluation settings (i.e., synthetic long-tail, unbiased, temporal split, and out-of-distribution evaluations) demonstrate that InvCF outperforms state-of-the-art baselines in popularity generalization ability on real recommendations. Visualization studies shed light on the advantages of InvCF for disentangled representation learning. Our code is available at https://github.com/anzhang314/InvCF.
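To make the notion of a popularity distribution shift concrete, the following is a minimal sketch that compares how interactions are spread over items in a training split and a test split. It is not part of InvCF and is not taken from the authors' repository; the toy interaction lists, the function names, and the choice of total variation distance as the shift measure are all assumptions made for illustration.

```python
from collections import Counter


def popularity_distribution(interactions, items):
    """Return the fraction of interactions each item receives.

    `interactions` is a list of (user, item) pairs; `items` fixes the
    order of the returned probabilities.
    """
    counts = Counter(item for _, item in interactions)
    total = sum(counts.values())
    return [counts[item] / total for item in items]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical toy data: item "A" dominates training but not testing.
train = [("u1", "A"), ("u2", "A"), ("u3", "A"), ("u1", "B")]
test = [("u1", "B"), ("u2", "C"), ("u3", "B"), ("u2", "A")]
items = ["A", "B", "C"]

p_train = popularity_distribution(train, items)
p_test = popularity_distribution(test, items)
shift = total_variation(p_train, p_test)
print(p_train, p_test, shift)
```

A distance of 0 would mean both splits share the same popularity distribution, and values closer to 1 indicate a larger shift. A model that has memorized the training popularity of item "A" would be poorly matched to the test split in this toy case.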

Related content

Collaborative filtering

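Collaborative filtering, simply put, recommends information a user is likely to find interesting by drawing on the preferences of a group with similar tastes and shared experience. Individuals respond to information through a cooperative mechanism (for example, by rating it), and these responses are recorded in order to filter content and help others screen information. Responses need not be limited to items of particular interest; records of items a user particularly dislikes are just as important. Collaborative filtering can be further divided into rating and social filtering. It later became an important part of e-commerce: a system recommends "items the customer may like" based on that customer's past purchases and on the purchases of customers with similar purchasing behavior, thereby providing personalized information and product recommendations grounded in community preferences. Beyond recommendation, methods have been developed in recent years that automatically compute the strength of preferences, so that filtering rests on firmer ground. The results may not be perfectly accurate, but adding preference-strength ratings has broadened the concept's applications, which now include, besides e-commerce, information retrieval, personal online media libraries, and personal bookshelves.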
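The idea of recommending items based on users with similar past behavior can be sketched with a small user-based example. This is one common textbook variant (cosine similarity between users, similarity-weighted average of neighbors' ratings), not a method described on this page; the rating table, the function names, and the parameter `k` are hypothetical.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book": 5, "film": 3, "album": 4},
    "bob": {"book": 4, "film": 3, "game": 5},
    "carol": {"film": 1, "album": 2, "game": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, k=1):
    """Return up to k unseen items, ranked by predicted rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbors' ratings.
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


print(recommend("alice", ratings))
print(recommend("carol", ratings))
```

Because each prediction is driven by what similar users have already rated, items with many interactions tend to be scored more often. That tendency is one way popularity can become entangled with preference in CF models.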
