The rise of online multi-modal sharing platforms such as TikTok and YouTube has enabled personalized recommender systems to incorporate multiple modalities (e.g., visual, textual, and acoustic) into user representations. However, data sparsity remains a key challenge in these systems. To address this limitation, recent research has introduced self-supervised learning techniques to enhance recommender systems. These methods, however, often rely on simplistic random augmentation or intuitive cross-view information, which can introduce irrelevant noise and fail to accurately align the multi-modal context with user-item interaction modeling. To fill this research gap, we propose DiffMM, a novel multi-modal graph diffusion model for recommendation. Our framework integrates a modality-aware graph diffusion model with a cross-modal contrastive learning paradigm to improve modality-aware user representation learning, facilitating better alignment between multi-modal feature information and collaborative relation modeling. Leveraging the generative capabilities of diffusion models, our approach automatically generates a modality-aware user-item graph, enabling the incorporation of useful multi-modal knowledge into the modeling of user-item interactions. Extensive experiments on three public datasets consistently demonstrate the superiority of DiffMM over various competitive baselines. The source code of our framework is available at: https://github.com/HKUDS/DiffMM.