Effectively leveraging multimodal data such as various images, laboratory tests and clinical information is gaining traction in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multimodal techniques focus only on improving performance by exploiting the differences or shared features among modalities and fusing features across them. These approaches are generally suboptimal in clinical settings, which pose the additional challenges of limited training data and abundant redundant data or noisy modality channels, leading to subpar performance. To address this gap, we study the robustness of existing methods to data redundancy and noise and propose a generalized dynamic multimodal information bottleneck framework for attaining a robust fused feature representation. Specifically, our information bottleneck module filters out task-irrelevant information and noise in the fused feature, and we further introduce a sufficiency loss to prevent task-relevant information from being dropped, thus explicitly preserving the sufficiency of prediction information in the distilled feature. We validate our model on an in-house and a public COVID-19 dataset for mortality prediction as well as two public biomedical datasets for diagnostic tasks. Extensive experiments show that our method surpasses the state-of-the-art and is significantly more robust, being the only method to maintain performance when large-scale noisy channels are present. Our code is publicly available at https://github.com/BII-wushuang/DMIB.
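As a rough illustration of the information bottleneck idea, the sketch below computes a generic variational information bottleneck objective for a single sample: a prediction term (softmax cross-entropy) plus a compression term (KL divergence from a diagonal Gaussian code to a standard normal prior) weighted by a coefficient. It is not the implementation in the linked repository. The function names, the Gaussian parameterisation and the `beta` weight are all assumptions made for this example, and the sufficiency loss is left out because its form is not specified here.

```python
import math


def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats (closed form)."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def vib_loss(logits, label, mu, log_var, beta=1e-3):
    """Hypothetical bottleneck objective: prediction term + beta * compression term.

    `beta` (an assumed hyperparameter) trades off how strongly the fused
    code is compressed against how well it predicts the label.
    """
    compression = kl_diag_gaussian_to_standard_normal(mu, log_var)
    return cross_entropy(logits, label) + beta * compression


if __name__ == "__main__":
    # Toy values standing in for a fused multimodal code and classifier output.
    mu = [1.0, 0.0]
    log_var = [0.0, 0.0]
    logits = [0.0, 0.0]
    print(vib_loss(logits, label=0, mu=mu, log_var=log_var, beta=0.5))
```

In this kind of objective, a larger `beta` pushes the fused code towards the prior and discards more input information, which is why a separate term that protects task-relevant information can be useful.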