For predicting cancer survival outcomes, standard approaches in clinical research are often based on two main modalities: pathology images for observing cell morphology features, and genomic data (e.g., bulk RNA-seq) for quantifying gene expression. However, existing pathology-genomic multi-modal algorithms face significant challenges: (1) valuable biological insights regarding genes and gene-gene interactions are frequently overlooked; (2) one modality often dominates the optimization process, causing inadequate training of the other modality. In this paper, we introduce a new multi-modal ``Path-GPTOmic'' framework for cancer survival outcome prediction. First, to extract valuable biological insights, we regulate the embedding space of a foundation model, scGPT, which was initially trained on single-cell RNA-seq data, making it adaptable to bulk RNA-seq data. Second, to address the imbalance-between-modalities problem, we propose a gradient modulation mechanism tailored to the Cox partial likelihood loss for survival prediction. The contributions of the modalities are dynamically monitored and adjusted during training, so that both modalities are sufficiently trained. Evaluated on two TCGA (The Cancer Genome Atlas) datasets, our model achieves substantially improved survival prediction accuracy.
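The abstract names the Cox partial likelihood loss without stating its form. As a point of reference, the following is a minimal sketch of the standard negative log partial likelihood, assuming one scalar risk score per patient and a risk set made up of all patients whose observed time is at least the event time. The function name `cox_neg_log_partial_likelihood` and its arguments are illustrative and not taken from the paper, and the sketch does not attempt to reproduce the proposed gradient modulation mechanism.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log Cox partial likelihood over observed events.

    risk_scores: predicted log-risk per patient (higher means higher hazard)
    times:       observed follow-up time per patient
    events:      1 if the event (death) was observed, 0 if censored
    All names here are illustrative assumptions, not the paper's API.
    """
    total = 0.0
    n_events = 0
    for r_i, t_i, e_i in zip(risk_scores, times, events):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        at_risk = [r_j for r_j, t_j in zip(risk_scores, times) if t_j >= t_i]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(at_risk)
        log_denominator = m + math.log(sum(math.exp(r - m) for r in at_risk))
        total += r_i - log_denominator
        n_events += 1
    if n_events == 0:
        return 0.0  # no observed events: the partial likelihood is empty
    return -total / n_events


# Scores that rank earlier deaths as higher risk should give a lower loss.
concordant = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], [1, 2, 3], [1, 1, 1])
discordant = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], [1, 2, 3], [1, 1, 1])
```

In a multi-modal setting, the risk score would typically be produced from the fused pathology and genomic features, which is where an imbalance between the modalities' gradient contributions can arise.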