Federated learning (FL) is a distributed machine learning (ML) paradigm in which clients collaborate by sharing only model parameters, so that original user data is never accessed, infringed upon, or leaked. In the Internet of Things (IoT), edge devices increasingly leverage multimodal data compositions and fusion paradigms to enhance model performance. However, two main challenges remain open in FL applications: (i) handling heterogeneous clients that lack specific modalities and (ii) devising a modality upload strategy that minimizes communication overhead while maximizing learning performance. In this paper, we propose Federated Multimodal Fusion learning with Selective modality communication (FedMFS), a new multimodal fusion FL methodology that tackles both challenges. The key idea is to use Shapley values to quantify each modality's contribution and the size of each modality model to gauge communication overhead, so that each client can selectively upload modality models to the server for aggregation. This enables FedMFS to flexibly balance performance against communication cost, depending on resource constraints and the application. Experiments on real-world multimodal datasets demonstrate the effectiveness of FedMFS, which achieves comparable accuracy while reducing communication overhead to one twentieth of that of the baselines.
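The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the FedMFS implementation: the abstract does not give the scoring rule, so the weighted score below, the trade-off parameter `alpha`, the `top_k` cutoff, and the toy value function and model sizes are all assumptions made for illustration. The only standard part is the exact Shapley value computation, which is feasible here because a client has only a handful of modalities.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value_fn):
    """Exact Shapley values for a small set of players (here: modalities).

    value_fn maps a frozenset of players to a scalar payoff, for example the
    validation accuracy of a fusion model restricted to those modalities.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {p}) - value_fn(s))
        phi[p] = total
    return phi


def select_modalities(phi, size_mb, alpha=0.5, top_k=1):
    """Pick the top_k modalities under a HYPOTHETICAL priority score.

    The score rewards a large Shapley contribution and penalises a large
    model size; alpha in [0, 1] sets the trade-off. This rule is an
    assumption for illustration, not the rule used by FedMFS.
    """
    max_phi = max(abs(v) for v in phi.values()) or 1.0
    max_size = max(size_mb.values())
    score = {
        m: alpha * phi[m] / max_phi - (1 - alpha) * size_mb[m] / max_size
        for m in phi
    }
    return sorted(score, key=score.get, reverse=True)[:top_k]


# Toy data (made up): each modality adds a fixed accuracy gain.
gain = {"audio": 0.30, "video": 0.50, "imu": 0.10}
size_mb = {"audio": 5.0, "video": 100.0, "imu": 1.0}


def toy_value(subset):
    return sum(gain[m] for m in subset)


phi = shapley_values(list(gain), toy_value)
chosen = select_modalities(phi, size_mb, alpha=0.5, top_k=1)
```

With these made-up numbers, the video model contributes the most but is also by far the largest, so a balanced `alpha` favours uploading the audio model, while `alpha=1.0` ignores size and picks video. In a real client, the value function would be an evaluation of the local fusion model, and exact enumeration would need to be replaced by sampling once the number of modalities grows.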