Histo-genomic multi-modal methods have recently emerged as a powerful paradigm with significant potential for improving cancer prognosis. However, genome sequencing, unlike histopathology imaging, is still not widely accessible in underdeveloped regions, which limits the clinical application of these multi-modal approaches. To address this, we propose a novel Genome-informed Hyper-Attention Network, termed G-HANet, which, for the first time, distills histo-genomic knowledge during training to improve uni-modal whole slide image (WSI)-based inference. Compared with traditional knowledge distillation methods used in other tasks (e.g., teacher-student architectures), our end-to-end model is superior in both training efficiency and the learning of cross-modal interactions. Specifically, the network comprises a cross-modal associating branch (CAB) and a hyper-attention survival branch (HSB). By reconstructing genomic data from WSIs, CAB distills the associations between functional genotypes and morphological phenotypes and offers insights into gene expression profiles in the feature space. HSB then leverages the distilled histo-genomic associations, together with the generated morphology-based weights, to perform hyper-attention modeling of patients from both histopathology and genomic perspectives and thereby improve cancer prognosis. Extensive experiments on five TCGA benchmark datasets demonstrate that G-HANet significantly outperforms state-of-the-art WSI-based methods and achieves performance competitive with genome-based and multi-modal methods. We expect G-HANet to serve the research community as a useful tool for addressing the current bottleneck of insufficient histo-genomic data pairing in cancer prognosis and precision oncology.
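The training-versus-inference asymmetry described above can be illustrated with a minimal sketch. This is not the G-HANet implementation: the class `ToyGenomeInformedModel`, the per-gene-group queries, the scalar reconstruction heads, and the softmax gating are all hypothetical stand-ins chosen for illustration, and the weights are random rather than learned. The only point it shows is that genomic data enters as a reconstruction target during training, so the forward pass needs WSI patch features alone.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, patches):
    """Each query attends over all patch features (scaled dot-product)."""
    dim = len(patches[0])
    scale = math.sqrt(dim)
    tokens = []
    for q in queries:
        weights = softmax([dot(q, p) / scale for p in patches])
        tokens.append([sum(w * p[d] for w, p in zip(weights, patches))
                       for d in range(dim)])
    return tokens


def reconstruction_loss(recon, genomic_target):
    """Mean squared error between reconstructed and measured expression."""
    return sum((r - g) ** 2 for r, g in zip(recon, genomic_target)) / len(recon)


class ToyGenomeInformedModel:
    """Hypothetical stand-in for a CAB-like and an HSB-like branch."""

    def __init__(self, n_groups, dim, seed=0):
        rng = random.Random(seed)

        def rand_vec():
            return [rng.gauss(0, 0.1) for _ in range(dim)]

        # Assumption: one query vector per functional gene group.
        self.queries = [rand_vec() for _ in range(n_groups)]
        # Assumption: each group token maps to one scalar expression value.
        self.recon_heads = [rand_vec() for _ in range(n_groups)]
        self.risk_head = rand_vec()

    def forward(self, patches):
        """Uses WSI patch features only; no genomic input is required."""
        tokens = cross_attend(self.queries, patches)
        # CAB-like step: reconstruct a genomic profile from the slide.
        recon = [dot(h, t) for h, t in zip(self.recon_heads, tokens)]
        # HSB-like step: weight the group tokens and pool them into a risk score.
        gate = softmax([dot(self.risk_head, t) for t in tokens])
        dim = len(tokens[0])
        pooled = [sum(g * t[d] for g, t in zip(gate, tokens)) for d in range(dim)]
        return dot(self.risk_head, pooled), recon


# Synthetic data: 16 patches with 8-dimensional features, 4 gene groups.
rng = random.Random(1)
patches = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(16)]
genomic_target = [rng.gauss(0, 1) for _ in range(4)]

model = ToyGenomeInformedModel(n_groups=4, dim=8)
risk, recon = model.forward(patches)                # inference path: WSI only
aux = reconstruction_loss(recon, genomic_target)    # training-only auxiliary term
```

In a real training loop, an auxiliary term like `aux` would be added to a survival loss and both would be optimized jointly, which is one way to read the end-to-end design contrasted above with a separate teacher-student pipeline.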