Integrating DNA methylation data with Whole Slide Images (WSIs) offers significant potential for improving the diagnostic precision of central nervous system (CNS) tumor classification in neuropathology. Existing approaches typically integrate encoded omic data with histology at either an early or a late fusion stage; reintroducing omic data at both stages (dual fusion) remains unexplored. In this paper, we propose using omic embeddings during both early and late fusion to capture complementary information across local (patch-level) and global (slide-level) interactions, boosting performance through multimodal integration. In the early fusion stage, omic embeddings are projected onto WSI patches in latent space, generating embeddings that encapsulate per-patch molecular and morphological information and thereby incorporating omic information into the spatial representation of the WSI. These embeddings are then refined with a Multiple Instance Learning (MIL) gated attention mechanism that attends to diagnostically relevant patches. In the late fusion stage, we reintroduce the omic data by fusing it with the slide-level omic-WSI embeddings using a Multimodal Outer Arithmetic Block (MOAB), which intermingles features from both modalities to capture their correlations and complementarity. We demonstrate accurate CNS tumor subtyping across 20 fine-grained subtypes and validate our approach on benchmark datasets, achieving improved survival prediction on TCGA-BLCA and competitive performance on TCGA-BRCA compared to state-of-the-art methods. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnostics.
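The dual fusion pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the early-fusion operator, so an additive linear projection followed by `tanh` is assumed here; the gated attention follows the standard gated-attention MIL form; and the late-fusion step assumes that an outer arithmetic block stacks the outer sum, difference, product, and quotient of the two embeddings. All function names, weight matrices, and dimensions are hypothetical, and the weights are random rather than learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def early_fusion(patches, omic, w_patch, w_omic):
    """Mix one omic embedding into every patch embedding (assumed additive form)."""
    # (omic @ w_omic) has shape (d,) and broadcasts across all n patches.
    return np.tanh(patches @ w_patch + omic @ w_omic)

def gated_attention_pool(h, v, u, w):
    """Gated-attention MIL pooling over patch embeddings h of shape (n, d)."""
    scores = (np.tanh(h @ v) * sigmoid(h @ u)) @ w   # one score per patch, shape (n,)
    scores = scores - scores.max()                   # numerically stable softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    return attn @ h, attn                            # slide embedding (d,), weights (n,)

def outer_arithmetic(a, b, eps=1e-6):
    """Stack outer sum, difference, product, and quotient: shape (4, len(a), len(b))."""
    a_col, b_row = a[:, None], b[None, :]
    # Assumption: denominators with magnitude below eps are replaced by eps.
    denom = np.where(np.abs(b_row) < eps, eps, b_row)
    return np.stack([a_col + b_row, a_col - b_row, a_col * b_row, a_col / denom])

# Hypothetical sizes: n patches, embedding width d, attention hidden width k.
rng = np.random.default_rng(0)
n, d, k = 8, 16, 4
patches = rng.normal(size=(n, d))   # stand-in for encoded WSI patches
omic = rng.normal(size=d)           # stand-in for the encoded methylation profile

# Early fusion: omic information enters every patch embedding.
fused = early_fusion(patches, omic,
                     0.1 * rng.normal(size=(d, d)),
                     0.1 * rng.normal(size=(d, d)))

# MIL pooling: attention weights indicate which patches drive the slide embedding.
slide, attn = gated_attention_pool(fused,
                                   rng.normal(size=(d, k)),
                                   rng.normal(size=(d, k)),
                                   rng.normal(size=k))

# Late fusion: the omic embedding is reintroduced at slide level.
late = outer_arithmetic(slide, omic)
print(fused.shape, slide.shape, late.shape)
```

In a full model, the stacked tensor `late` would typically be passed to a small convolutional or linear head for classification or survival prediction, and `attn` could be mapped back onto patch coordinates to visualize which regions the model attends to.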