Multimodal deep learning (MDL) has achieved remarkable success across various domains, yet its practical deployment is often hindered by incomplete multimodal data. Existing incomplete MDL methods either discard missing modalities, risking the loss of valuable task-relevant information, or recover them, potentially introducing irrelevant noise; we refer to this trade-off as the discarding-imputation dilemma. To address it, we propose DyMo, a new inference-time dynamic modality selection framework that adaptively identifies and integrates reliable recovered modalities, exploiting task-relevant information beyond the conventional discard-or-impute paradigm. Central to DyMo is a novel selection algorithm that maximizes multimodal task-relevant information for each test sample. Because directly estimating this information at test time is intractable when the data distribution is unknown, we theoretically establish a connection between this information and the task loss, which can be computed at inference time and serves as a tractable proxy. Building on this connection, we propose a principled reward function to guide modality selection. In addition, we design a flexible multimodal network architecture compatible with arbitrary modality combinations, alongside a tailored training strategy for robust representation learning. Extensive experiments on diverse natural and medical image datasets show that DyMo significantly outperforms state-of-the-art incomplete/dynamic MDL methods across various missing-data scenarios. Our code is available at https://github.com//siyi-wind/DyMo.
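The per-sample selection idea described in the abstract can be illustrated with a small greedy loop. This is only a sketch under stated assumptions, not DyMo's actual algorithm: the function name `select_modalities`, the caller-supplied `proxy_loss` callable (standing in for a task loss that can be evaluated without labels at inference time), the greedy search strategy, and the toy numbers are all hypothetical and do not come from the paper. The reward of a recovered modality is taken here to be the drop in proxy loss obtained by adding it.

```python
from typing import Callable, FrozenSet, Iterable


def select_modalities(
    observed: Iterable[str],
    recovered: Iterable[str],
    proxy_loss: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedily add recovered modalities that lower a proxy loss.

    Hypothetical sketch: `proxy_loss` maps a set of modality names to a
    scalar that is assumed to be computable for a single test sample.
    """
    selected = frozenset(observed)   # observed modalities are always kept
    remaining = set(recovered)       # candidates produced by imputation
    current = proxy_loss(selected)

    while remaining:
        # Proxy loss obtained if each remaining candidate were added.
        losses = {m: proxy_loss(selected | {m}) for m in remaining}
        # Sort names first so that ties are broken deterministically.
        best = min(sorted(losses), key=losses.get)
        # Stop when no candidate yields a positive reward (loss reduction).
        if losses[best] >= current:
            break
        selected = selected | {best}
        remaining.discard(best)
        current = losses[best]

    return selected


def toy_proxy_loss(modalities: FrozenSet[str]) -> float:
    """Toy stand-in: each modality shifts the loss by a fixed, made-up amount."""
    gain = {"image": 0.5, "table_recovered": 0.3, "text_recovered": -0.2}
    return 1.0 - sum(gain[m] for m in modalities)


chosen = select_modalities(
    observed=["image"],
    recovered=["table_recovered", "text_recovered"],
    proxy_loss=toy_proxy_loss,
)
print(sorted(chosen))
```

In this toy setting the recovered table modality lowers the proxy loss and is kept, while the recovered text modality raises it and is left out, which mirrors the goal of integrating only reliable recovered modalities rather than discarding or imputing everything.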