Federated Recommendation (FR) is a new learning paradigm for tackling the learning-to-rank problem in a privacy-preserving manner. How to integrate multimodal features into federated recommendation remains an open challenge in terms of efficiency, distribution heterogeneity, and fine-grained alignment. To address these challenges, we propose GFMFR, a novel multimodal fusion mechanism for federated recommendation. Specifically, GFMFR offloads multimodal representation learning to the server, which stores item content and employs a high-capacity encoder to generate expressive representations, thereby alleviating client-side overhead. Moreover, a group-aware item representation fusion approach enables fine-grained knowledge sharing among similar users while retaining individual preferences. The proposed fusion loss can simply be plugged into any existing federated recommender system, augmenting it with multimodal features. Extensive experiments on five public benchmark datasets demonstrate that GFMFR consistently outperforms state-of-the-art multimodal FR baselines.