Recommender systems usually rely on observed user interaction data to build personalized recommendation models, assuming that the observed data reflect user interest. However, a user may also interact with an item out of conformity, that is, the tendency to follow popular items. Most previous studies neglect user conformity and entangle it with interest, which may cause recommender systems to fail to provide satisfying results. Therefore, from a cause-effect view, disentangling these interaction causes is a crucial issue. Doing so also helps with out-of-distribution (OOD) problems, where training and test data follow different distributions. Nevertheless, it is quite challenging because we lack a signal to differentiate interest from conformity. The sparsity of data driven purely by one cause and the long-tail distribution of items hinder the learning of disentangled causal embeddings. In this paper, we propose DCCL, a framework that adopts contrastive learning to disentangle these two causes through sample augmentation for interest and conformity respectively. Furthermore, DCCL is model-agnostic, so it can be easily deployed in any industrial online system. Extensive experiments on two real-world datasets show that DCCL outperforms state-of-the-art baselines on top of various backbone models in various OOD environments. We also demonstrate performance improvements through online A/B testing on Kuaishou, a billion-user scale short-video recommender system.