In recent years, neural architecture-based recommender systems have achieved tremendous success, but they still fall short of expectations when dealing with highly sparse data. Self-supervised learning (SSL), an emerging technique for learning from unlabeled data, has attracted considerable attention as a potential solution to this issue. This survey presents a systematic and timely review of research efforts on self-supervised recommendation (SSR). Specifically, we propose an exclusive definition of SSR, on top of which we develop a comprehensive taxonomy that divides existing SSR methods into four categories: contrastive, generative, predictive, and hybrid. For each category, we elucidate its concept and formulation, the methods involved, and its pros and cons. Furthermore, to facilitate empirical comparison, we release an open-source library, SELFRec (https://github.com/Coder-Yu/SELFRec), which incorporates a wide range of SSR models and benchmark datasets. Through rigorous experiments using this library, we derive and report some significant findings regarding the selection of self-supervised signals for enhancing recommendation. Finally, we shed light on the limitations of current research and outline future research directions.