Existing collaborative recommendation models that use multi-modal information emphasize the representation of users' preferences but tend to overlook the representation of users' dislikes. Modelling users' dislikes, however, helps characterize user profiles more comprehensively, so a representation of dislikes should be integrated into user modelling when constructing a collaborative recommendation model. In this paper, we propose a novel Collaborative Recommendation Model based on a Multi-modal multi-view Attention Network (CRMMAN), in which users are represented from both a preference view and a dislike view. Specifically, each user's historical interactions are divided into positive and negative interactions, which are used to model the user's preference and dislike views, respectively. Furthermore, the semantic and structural information extracted from the scene is employed to enrich the item representation. We validate CRMMAN with comparative experiments on two benchmark datasets, MovieLens-1M and Book-Crossing. MovieLens-1M has about one million ratings, and Book-Crossing has about 300,000 ratings. Compared with state-of-the-art knowledge-graph-based and multi-modal recommendation methods, CRMMAN improves AUC, NDCG@5 and NDCG@10 by 2.08%, 2.20% and 2.26%, respectively, averaged over the two datasets. We also conduct controlled experiments to explore the effects of the multi-modal information and the multi-view mechanism. The experimental results show that both enhance the model's performance.
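As an illustration of the split into positive and negative interactions described above, the following is a minimal sketch rather than the CRMMAN implementation. The rating threshold, the function name `split_interactions`, and the toy ratings are all assumptions made for this example; the abstract does not state how the split is defined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed threshold (not given in the abstract): ratings at or above it
# are treated as positive interactions, all others as negative.
POSITIVE_THRESHOLD = 4.0


def split_interactions(
    ratings: Iterable[Tuple[str, str, float]],
    threshold: float = POSITIVE_THRESHOLD,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Group each user's rated items into a preference view and a dislike view."""
    positive: Dict[str, List[str]] = defaultdict(list)
    negative: Dict[str, List[str]] = defaultdict(list)
    for user, item, rating in ratings:
        # Route the item to the view that matches the rating.
        target = positive if rating >= threshold else negative
        target[user].append(item)
    return dict(positive), dict(negative)


# Hypothetical toy data: (user id, item id, rating).
toy_ratings = [
    ("u1", "m1", 5.0),
    ("u1", "m2", 2.0),
    ("u2", "m1", 4.0),
    ("u1", "m3", 1.0),
]
preference_view, dislike_view = split_interactions(toy_ratings)
```

In this sketch, `preference_view` and `dislike_view` map each user to the items that would feed the respective view; a user with no negative interactions simply has no entry in `dislike_view`.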