Recommender systems (RS) have become an integral part of online services. They use various deep learning techniques to model user preferences from identifier and attribute information. With the emergence of multimedia services such as short videos and news, understanding this content during recommendation has become critical. In addition, multimodal features help alleviate the data sparsity problem in RS. Consequently, multimodal recommender systems (MRS) have recently attracted much attention from both academia and industry. In this paper, we give a comprehensive survey of MRS models, mainly from a technical perspective. First, we summarize the general procedures and major challenges of MRS. Then, we introduce existing MRS models in three categories: Feature Interaction, Feature Enhancement, and Model Optimization. For the convenience of researchers entering this field, we also summarize the available dataset and code resources. Finally, we discuss some promising future directions for MRS and conclude the paper.