Multimodal recommendation has attracted significant attention in recent years, and GCN-based methods in particular have achieved strong results. Two key challenges remain, however. (1) Multimodal features contain a large amount of redundant information unrelated to user preferences, so injecting them directly into the interaction graph can impair collaborative feature learning between users and items. (2) Observed feedback contains false-negative and false-positive behaviors caused by system errors such as accidental clicks and non-exposure. This feedback bias distorts the ranking of training sample pairs and thereby reduces the model's recommendation accuracy. To address these challenges, we propose a Joint Behavior-guided and Modal-consistent Conditional Graph Diffusion Model (JBM-Diff) that jointly denoises multimodal features and user feedback. For each modality, we design a diffusion model conditioned on collaborative features to remove preference-irrelevant information, and we strengthen the alignment between collaborative features and modal semantic information through multi-view message propagation and feature fusion. Finally, based on the learned modal preferences, we check the partial-order consistency of sample pairs from a behavioral perspective, assign a credibility to each sample pair, and perform data augmentation. Extensive experiments on three public datasets demonstrate the effectiveness of the proposed method. Code is available at https://github.com/pxcstart/JBMDiff.
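As a rough illustration of the final step described in the abstract (checking partial-order consistency of sample pairs and assigning each pair a credibility), the following minimal sketch weights a standard BPR pairwise loss by a per-pair credibility. It is not the authors' implementation: the function names `pair_credibility` and `weighted_bpr_loss`, the two-level weighting rule, and the down-weighting value `low=0.5` are all hypothetical choices made for this example, and the abstract does not specify how credibility is actually computed.

```python
import math


def pair_credibility(modal_pos, modal_neg, low=0.5):
    """Credibility of one (positive, negative) training pair.

    ASSUMPTION: a pair counts as consistent when the modal-preference score
    of the observed positive item exceeds that of the sampled negative item.
    Consistent pairs get weight 1.0, all others get `low`. The two-level rule
    and the default value of `low` are illustrative only.
    """
    return 1.0 if modal_pos > modal_neg else low


def weighted_bpr_loss(cf_pos, cf_neg, modal_pos, modal_neg, low=0.5):
    """Credibility-weighted BPR loss averaged over (user, pos, neg) triples.

    cf_pos / cf_neg:       collaborative scores for positive / negative items
    modal_pos / modal_neg: modal-preference scores for the same items
    """
    total = 0.0
    for cp, cn, mp, mn in zip(cf_pos, cf_neg, modal_pos, modal_neg):
        w = pair_credibility(mp, mn, low)
        x = cp - cn
        # -log(sigmoid(x)) = log(1 + exp(-x)), computed in a stable form
        total += w * (max(-x, 0.0) + math.log1p(math.exp(-abs(x))))
    return total / len(cf_pos)


# One pair where the modal scores agree with the observed order,
# one where they disagree (and is therefore down-weighted).
loss = weighted_bpr_loss(
    cf_pos=[2.0, 0.5], cf_neg=[1.0, 0.8],
    modal_pos=[0.9, 0.1], modal_neg=[0.2, 0.7],
)
print(loss)
```

Under these assumptions, a pair whose observed order contradicts the learned modal preference contributes less to the gradient, which is one simple way to reduce the influence of suspected false positives and false negatives.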