Federated Learning (FL) is a method for training machine learning models on distributed data sources. It preserves privacy by letting clients collaboratively learn a shared global model while keeping their data stored locally. However, a significant challenge arises when clients' datasets have missing modalities, i.e., certain features or modalities are unavailable or incomplete, leading to heterogeneous data distributions. While previous studies have addressed complete-modality missing, they fail to tackle partial-modality missing, where severe instance-level heterogeneity among clients means the pattern of missing data can vary significantly from one sample to another. To tackle this challenge, this study proposes a novel framework named FedMAC, designed to address multi-modality missing under conditions of partial-modality missing in FL. Additionally, to avoid trivial aggregation of multi-modal features, we introduce a contrastive-based regularization that imposes additional constraints on the latent representation space. The experimental results demonstrate the effectiveness of FedMAC across various client configurations with statistical heterogeneity, outperforming baseline methods by up to 26% in severe missing scenarios and highlighting its potential as a solution to the challenge of partially missing modalities in federated systems.
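The abstract does not specify the exact form of the contrastive regularization. As a minimal sketch of the general idea, the following assumes an InfoNCE-style loss over paired modality embeddings, where matched pairs within a batch are pulled together and all other pairs act as negatives; the function name and batch convention are illustrative, not taken from the paper.

```python
import numpy as np

def contrastive_regularizer(z_a, z_b, temperature=0.5):
    """InfoNCE-style contrastive loss between two batches of modality
    embeddings z_a, z_b of shape (N, d). Row i of z_a and row i of z_b
    are treated as a positive pair; all other rows are negatives.
    (Hypothetical sketch -- not the paper's exact formulation.)"""
    # L2-normalize so the dot product is cosine similarity
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # (N, N) similarity matrix
    # Cross-entropy with the diagonal (matched pairs) as targets
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

A regularizer of this kind penalizes the encoder when embeddings of mismatched samples are as similar as those of matched ones, which is one way to prevent the trivial aggregation of multi-modal features mentioned above.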