Federated learning (FL) underpins advances in privacy-preserving distributed computing by collaboratively training neural networks without exposing clients' raw data. Current FL paradigms focus primarily on uni-modal data, while exploiting the knowledge in distributed multimodal data remains largely unexplored. Existing multimodal FL (MFL) solutions are mainly designed to handle statistical or modality heterogeneity on the input side; they have yet to solve the fundamental issue of "modality imbalance" in distributed conditions, which can lead to inadequate information exploitation and heterogeneous knowledge aggregation across modalities. In this paper, we propose a novel Cross-Modal Infiltration Federated Learning (FedCMI) framework that effectively alleviates modality imbalance and knowledge heterogeneity via knowledge transfer from the global dominant modality. To avoid losing information in the weak modality through mere imitation of the dominant modality's behavior, we design a two-projector module that integrates knowledge from the dominant modality while still promoting local feature exploitation in the weak modality. In addition, we introduce a class-wise temperature adaptation scheme to achieve fair performance across classes. Extensive experiments on popular datasets support the effectiveness of the proposed framework in fully exploring the information of each modality in MFL.
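
To make the idea of class-wise temperature in knowledge transfer concrete, the following is a minimal sketch, not the FedCMI implementation. The abstract gives no formula, so everything here is an assumption chosen for illustration: the function names, the use of a KL-divergence distillation loss between dominant-modality ("teacher") and weak-modality ("student") logits, and the rule that each sample takes the temperature assigned to its ground-truth class.

```python
import numpy as np


def softmax(logits, temperature):
    """Row-wise softmax; `temperature` has shape (n, 1), one value per sample."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=1, keepdims=True)


def classwise_distillation_loss(student_logits, teacher_logits, labels, class_temps):
    """Mean KL(teacher || student) with a per-class softening temperature.

    Assumption (not from the paper): sample i uses class_temps[labels[i]].
    """
    t = class_temps[labels][:, None]          # (n, 1) temperature per sample
    p_teacher = softmax(teacher_logits, t)    # softened dominant-modality targets
    p_student = softmax(student_logits, t)    # softened weak-modality predictions
    eps = 1e-12                               # guards against log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=1
    )
    return float(kl.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weak = rng.normal(size=(4, 3))        # hypothetical weak-modality logits
    dominant = rng.normal(size=(4, 3))    # hypothetical dominant-modality logits
    labels = np.array([0, 1, 2, 1])
    temps = np.array([1.0, 2.0, 4.0])     # hypothetical per-class temperatures
    print(classwise_distillation_loss(weak, dominant, labels, temps))
```

In this sketch a larger temperature flattens both distributions for samples of that class, which softens the transfer signal for it. How the temperatures are adapted during training is not specified in the abstract and is left out here.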