Pre-training is an emerging technique for enhancing a neural model's performance and has been shown to be effective for many neural language models, such as BERT. The technique has also been used to enhance the performance of recommender systems, where pre-trained models are used to learn a better initialisation for both users and items. However, existing pre-trained recommender systems tend to incorporate only the user interaction data at the pre-training stage, making it difficult to deliver good recommendations, especially when the interaction data is sparse. To alleviate this common data sparsity issue, we propose to pre-train the recommendation model not only with the interaction data but also with other available information such as the social relations among users, thereby providing the recommender system with a better initialisation than relying solely on the user interaction data. We propose a novel recommendation model, the Social-aware Gaussian Pre-trained model (SGP), which encodes the user social relations and interaction data at the pre-training stage in a Graph Neural Network (GNN). In the subsequent fine-tuning stage, our SGP model adopts a Gaussian Mixture Model (GMM) to factorise these pre-trained embeddings for further training, thereby benefiting cold-start users through these pre-built social relations. Our extensive experiments on three public datasets show that, in comparison to 16 competitive baselines, our SGP model significantly outperforms the best baseline by up to 7.7% in terms of NDCG@10. In addition, we show that SGP effectively alleviates the cold-start problem, especially when users newly register to the system through their friends' suggestions.
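The reported improvement is measured in NDCG@10. As a reading aid, the following is a minimal sketch of how NDCG@k is commonly computed for a single user. It assumes binary relevance and a log2 rank discount; the abstract does not state which variant the experiments use, and the names `dcg_at_k`, `ndcg_at_k`, `ranked_items` and `relevant_items` are illustrative, not taken from the paper.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance scores."""
    # Rank 0 is the top position; its discount is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """NDCG@k for one user, assuming binary relevance (an assumption of this sketch).

    ranked_items:   items in the order the recommender ranked them.
    relevant_items: set of held-out items the user actually interacted with.
    """
    gains = [1.0 if item in relevant_items else 0.0 for item in ranked_items[:k]]
    # The ideal ranking places every relevant item (at most k of them) first.
    ideal_hits = min(len(relevant_items), k)
    ideal = dcg_at_k([1.0] * ideal_hits, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

For example, `ndcg_at_k(["a", "b", "c"], {"a"})` is 1.0 because the only relevant item is ranked first, while `ndcg_at_k(["b", "a"], {"a"})` is 1 / log2(3), roughly 0.63. A dataset-level score would typically average this value over all test users.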