Survival prediction is an important branch of cancer prognosis analysis. A model that predicts survival risk from TCGA genomics data can identify cancer-related genes and support diagnosis and treatment recommendations based on patient characteristics. We found that deep learning models based on Cox proportional hazards (CPH) often overfit when dealing with high-throughput data. Moreover, we found that as the number of network layers increases, the results do not improve and network degradation occurs. To address this problem, we propose a new framework based on deep residual learning that combines the ideas of Cox proportional hazards and residual connections, which we name ResSurv. First, ResSurv is a feed-forward deep learning network built by stacking multiple basic ResNet blocks. In each ResNet block, we add a normalization layer to prevent vanishing and exploding gradients. Second, for the loss function of the neural network, we follow the Cox proportional hazards approach: we apply the semi-parametric form of the CPH model to the neural network and build the loss function from the partial likelihood, which is then used for backpropagation and gradient updates. Finally, we compared ResSurv networks of different depths and found that the network can effectively extract high-dimensional features. Ablation and comparative experiments show that our model reaches state-of-the-art (SOTA) performance among deep learning methods and that our network can effectively extract deep information.
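As a rough illustration of the two ingredients described above, the following minimal sketch shows a residual block with a normalization layer and a loss built from the Cox partial likelihood. It is not the ResSurv implementation: the function names, the placement of the normalization before the linear map, the ReLU activation, the Breslow-style handling of ties, and the averaging over observed events are all assumptions made for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def residual_block(x, weights, bias):
    """Hypothetical block: y = x + ReLU(W * LayerNorm(x) + b).

    `weights` is a square matrix (list of rows) so the skip connection
    can be added element-wise. The layout is an assumption, not the
    architecture reported for ResSurv.
    """
    h = layer_norm(x)
    out = []
    for row, b, skip in zip(weights, bias, x):
        z = sum(w * v for w, v in zip(row, h)) + b
        out.append(skip + max(z, 0.0))  # identity shortcut plus ReLU branch
    return out


def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk  : predicted log-risk score per patient (the network output)
    time  : observed survival or censoring time per patient
    event : 1 if the event was observed, 0 if the patient was censored
    """
    total, n_events = 0.0, 0
    for i in range(len(risk)):
        if event[i] != 1:
            continue  # censored patients only appear in risk sets
        # Risk set: patients still under observation at time[i].
        at_risk = [risk[j] for j in range(len(risk)) if time[j] >= time[i]]
        m = max(at_risk)  # log-sum-exp shift for numerical stability
        log_sum = m + math.log(sum(math.exp(r - m) for r in at_risk))
        total += risk[i] - log_sum
        n_events += 1
    if n_events == 0:
        return 0.0  # no observed events: the partial likelihood is empty
    return -total / n_events
```

In a real network the log-risk scores would come from a final linear layer on top of the stacked blocks, and the loss would be minimized with an automatic differentiation framework. The loss depends only on the ordering of event times through the risk sets, which reflects the semi-parametric nature of the CPH model: no baseline hazard has to be specified.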