Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts

Multimodal learning has gained increasing importance across various fields, offering the ability to integrate data from diverse sources such as images, text, and personalized records, which are common in medical domains. However, when some modalities are missing, many existing frameworks struggle to accommodate arbitrary modality combinations and often rely heavily on a single modality or on complete data. Overlooking the possible modality combinations in this way limits their applicability in real-world settings. To address this challenge, we propose Flex-MoE (Flexible Mixture-of-Experts), a new framework designed to flexibly incorporate arbitrary modality combinations while remaining robust to missing data. The core idea of Flex-MoE is to first address missing modalities with a new missing modality bank that integrates observed modality combinations with the corresponding missing ones. This is followed by a specially designed Sparse MoE framework. Specifically, Flex-MoE first trains experts on samples with all modalities to inject generalized knowledge through the generalized router ($\mathcal{G}$-Router). The $\mathcal{S}$-Router then specializes in handling combinations with fewer modalities by assigning the top-1 gate to the expert corresponding to the observed modality combination. We evaluate Flex-MoE on the ADNI dataset, which encompasses four modalities in the Alzheimer's Disease domain, as well as on the MIMIC-IV dataset. The results demonstrate the effectiveness of Flex-MoE, highlighting its ability to model arbitrary modality combinations in diverse missing-modality scenarios. Code is available at https://github.com/UNITES-Lab/flex-moe.
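To make the two mechanisms in the abstract concrete, the following is a minimal sketch, not the released implementation. It assumes four placeholder modalities named "A" to "D", one expert per non-empty modality combination, and a missing modality bank stored as a lookup table keyed by (observed combination, missing modality). All function names, the embedding dimension, and the random initialization are hypothetical; the $\mathcal{G}$-Router training stage on full-modality samples is omitted.

```python
from itertools import combinations
import random

# Placeholder modality names (assumption; the abstract only says "four modalities").
MODALITIES = ("A", "B", "C", "D")


def all_combinations(modalities):
    """Enumerate every non-empty subset of modalities in a fixed order."""
    combos = []
    for k in range(1, len(modalities) + 1):
        combos.extend(frozenset(c) for c in combinations(modalities, k))
    return combos


COMBOS = all_combinations(MODALITIES)
# Assumption: exactly one expert is tied to each modality combination.
EXPERT_OF = {combo: idx for idx, combo in enumerate(COMBOS)}


def build_missing_bank(modalities, dim, seed=0):
    """Hypothetical bank: one embedding per (observed combination, missing modality).

    In a real model these vectors would be learnable parameters; here they are
    fixed random lists so the sketch stays dependency-free.
    """
    rng = random.Random(seed)
    bank = {}
    for combo in COMBOS:
        for m in modalities:
            if m not in combo:
                bank[(combo, m)] = [rng.gauss(0.0, 0.02) for _ in range(dim)]
    return bank


def fill_missing(sample, bank, modalities):
    """Replace each missing modality with the bank entry for this combination."""
    observed = frozenset(sample)
    filled = {
        m: sample[m] if m in observed else bank[(observed, m)]
        for m in modalities
    }
    return observed, filled


def s_router_top1(observed):
    """Top-1 gate: pick the expert index tied to the observed combination."""
    return EXPERT_OF[observed]


if __name__ == "__main__":
    dim = 4
    bank = build_missing_bank(MODALITIES, dim)
    # A sample that only has modalities A and C.
    sample = {"A": [1.0] * dim, "C": [2.0] * dim}
    observed, filled = fill_missing(sample, bank, MODALITIES)
    print(sorted(observed), "-> expert", s_router_top1(observed))
```

With four modalities this layout gives 15 combinations, so the sketch allocates 15 expert slots and 28 bank entries. Whether the actual framework uses this exact expert count or bank indexing is not stated in the abstract and should be checked against the repository.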
