Effectively leveraging multimodal data, such as various images, laboratory tests, and clinical information, is gaining traction in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multimodal techniques focus solely on improving performance by exploiting the differences or shared features across modalities and fusing features from different modalities. These approaches are generally not optimal for clinical settings, which pose the additional challenges of limited training data and an abundance of redundant data or noisy modality channels, leading to subpar performance. To address this gap, we study the robustness of existing methods to data redundancy and noise and propose a generalized dynamic multimodal information bottleneck framework for attaining a robust fused feature representation. Specifically, our information bottleneck module filters out task-irrelevant information and noise in the fused feature, and we further introduce a sufficiency loss to prevent task-relevant information from being dropped, thus explicitly preserving the sufficiency of prediction information in the distilled feature. We validate our model on an in-house and a public COVID-19 dataset for mortality prediction, as well as on two public biomedical datasets for diagnostic tasks. Extensive experiments show that our method surpasses the state of the art and is significantly more robust, being the only method to maintain performance when large-scale noisy channels exist. Our code is publicly available at https://github.com/BII-wushuang/DMIB.
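For readers unfamiliar with information bottleneck objectives, the following is a minimal sketch of the generic variational information bottleneck loss: a prediction term plus a weighted compression term. It is not the authors' implementation (see the linked repository for that). The function names, the Gaussian latent assumption, and the default `beta` value are all illustrative assumptions, and the sufficiency loss is omitted because the abstract does not define it.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Acts as the compression term: it penalizes information kept in the latent code.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits).

    Acts as the prediction term: it rewards keeping task-relevant information.
    """
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]


def vib_loss(mu, log_var, logits, label, beta=1e-3):
    """Generic variational information bottleneck loss (hypothetical helper).

    `beta` trades prediction accuracy against compression; its default here
    is an arbitrary illustrative value, not one taken from the paper.
    """
    return cross_entropy(logits, label) + beta * gaussian_kl_to_standard_normal(
        mu, log_var
    )
```

In this sketch, `mu` and `log_var` would be produced by an encoder applied to the fused multimodal feature, and `logits` by a classifier applied to a sample from the resulting latent distribution. A larger `beta` discards more of the input, which is the mechanism that lets a bottleneck suppress redundant or noisy channels.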