Multimodal learning significantly benefits cancer survival prediction, especially through the integration of pathological images and genomic data. Despite these advantages, massive redundancy in multimodal data hinders the extraction of discriminative and compact information: (1) An extensive amount of intra-modal, task-unrelated information blurs discriminability, especially for gigapixel whole slide images (WSIs) with numerous patches in pathology and thousands of pathways in genomic data, leading to an ``intra-modal redundancy" issue. (2) Duplicated information among modalities dominates the representation of multimodal data, so that modality-specific information is prone to being ignored, resulting in an ``inter-modal redundancy" issue. To address these issues, we propose a new framework, Prototypical Information Bottlenecking and Disentangling (PIBD), consisting of a Prototypical Information Bottleneck (PIB) module for intra-modal redundancy and a Prototypical Information Disentanglement (PID) module for inter-modal redundancy. Specifically, PIB, a variant of the information bottleneck, models prototypes that approximate a set of instances for different risk levels, which can be used to select discriminative instances within each modality. The PID module then decouples entangled multimodal data into compact, distinct components, namely modality-common and modality-specific knowledge, under the guidance of the joint prototypical distribution. Extensive experiments on five cancer benchmark datasets demonstrate the superiority of our method over existing approaches.
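The prototype-guided instance selection described for PIB can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's actual method: it assumes instances (e.g., WSI patch embeddings) and one learned prototype per risk level, and keeps only the instances most similar to any prototype, discarding task-unrelated ones. All names and dimensions below are illustrative.

```python
import numpy as np

def select_discriminative_instances(instances, prototypes, k):
    """Toy sketch: score each instance by its best cosine similarity to any
    risk-level prototype, then keep the top-k highest-scoring instances."""
    # Row-normalize so dot products become cosine similarities.
    inst = instances / np.linalg.norm(instances, axis=1, keepdims=True)
    prot = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    # (num_instances, num_prototypes) similarity matrix; score = best match.
    scores = (inst @ prot.T).max(axis=1)
    # Indices of the k most prototype-aligned (discriminative) instances.
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
patches = rng.normal(size=(100, 16))   # stand-in for WSI patch embeddings
protos = rng.normal(size=(4, 16))      # one prototype per risk level (assumed)
selected = select_discriminative_instances(patches, protos, k=10)
print(selected.shape)  # (10,)
```

In the actual PIB module the prototypes are modeled within an information-bottleneck objective rather than fixed vectors; this sketch only conveys the idea of compressing many instances down to a discriminative subset.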