Given the large volume of side information from different modalities, multimodal recommender systems have become increasingly vital, as they exploit richer semantic information beyond user-item interactions. Recent works highlight that leveraging Graph Convolutional Networks (GCNs) to explicitly model multimodal item-item relations can significantly enhance recommendation performance. However, due to the inherent over-smoothing issue of GCNs, existing models benefit only from shallow GCNs with limited representation power. This drawback is especially pronounced for complex, high-dimensional patterns such as multimodal data, which require large-capacity models to capture their intricate correlations. To this end, in this paper we investigate bypassing GCNs when modeling multimodal item-item relationships. Specifically, we propose a Topology-aware Multi-Layer Perceptron (TMLP), which uses MLPs instead of GCNs to model the relationships between items. TMLP enhances MLPs with topological pruning, which denoises item-item relations, and intra- (inter-) modality learning, which integrates higher-order modality correlations. Extensive experiments on three real-world datasets verify TMLP's superiority over nine baselines. We also find that, by discarding the internal message passing of GCNs, which is sensitive to node connections, TMLP achieves significant improvements in both training efficiency and robustness compared with existing models.
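The core idea above, keeping item-item topology only as a training signal while an MLP (rather than GCN message passing) produces item representations, can be illustrated with a minimal sketch. All names (`knn_graph`, `mlp`, the alignment loss) and the specific pruning rule (top-k cosine similarity) are assumptions for illustration, not TMLP's actual formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy item features for two modalities (e.g., visual and textual).
n_items, d, k = 8, 16, 3
feats = {"visual": rng.normal(size=(n_items, d)),
         "textual": rng.normal(size=(n_items, d))}

def knn_graph(x, k=3):
    """Binary top-k cosine-similarity item-item graph; all weaker edges
    are dropped (a hypothetical stand-in for topological pruning)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)          # exclude self-loops
    adj = np.zeros_like(sim)
    for i in range(len(sim)):
        adj[i, np.argsort(sim[i])[-k:]] = 1.0   # keep only top-k neighbours
    return adj

def mlp(x, w1, w2):
    """Two-layer MLP: item embeddings depend only on the item's own
    features, so no message passing is needed at inference time."""
    return np.maximum(x @ w1, 0) @ w2

w1 = rng.normal(size=(d, 32)) * 0.1
w2 = rng.normal(size=(32, d)) * 0.1

# Topology enters only through a loss term: pull each item's MLP
# embedding toward the mean embedding of its pruned-graph neighbours.
for name, x in feats.items():
    adj = knn_graph(x, k)
    h = mlp(x, w1, w2)
    nbr_mean = adj @ h / k
    align_loss = np.mean((h - nbr_mean) ** 2)
    print(f"{name}: alignment loss = {align_loss:.4f}")
```

Because the graph is used only in the loss, corrupting edges at test time cannot perturb the embeddings, which is one plausible reading of the robustness claim in the abstract.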