The concordance index (C-index) is a commonly used metric in survival analysis for evaluating the performance of a prediction model. In this paper, we propose a decomposition of the C-index into a weighted harmonic mean of two quantities: one for ranking observed events against other observed events, and one for ranking observed events against censored cases. This decomposition enables a finer-grained analysis of the relative strengths and weaknesses of different survival prediction methods. We demonstrate its usefulness through benchmark comparisons of classical models, state-of-the-art methods, and SurVED, a new variational generative neural-network-based method proposed in this paper. Model performance is assessed on four publicly available datasets with varying levels of censoring. Using the C-index decomposition together with synthetic censoring, the analysis shows that deep learning models utilize the observed events more effectively than other models, which allows them to maintain a stable C-index across censoring levels. In contrast, classical machine learning models deteriorate when the censoring level decreases, because they are unable to improve their ranking of events against other events.
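The decomposition can be illustrated with a small sketch. The abstract does not state the exact weights, so the code below assumes one natural choice: if C_ee and C_ec are the numbers of concordant event-event and event-censored pairs, and alpha = C_ee / (C_ee + C_ec), then the overall C-index equals 1 / (alpha / CI_ee + (1 - alpha) / CI_ec), which follows directly from the pair counts. The function name `cindex_decomposition`, the tie handling, and the toy data are hypothetical and chosen only for illustration.

```python
def cindex_decomposition(times, events, risks):
    """Split the C-index into event-event and event-censored parts.

    Assumptions (not taken from the paper): pairs with tied times are
    skipped, tied risk scores count as 0.5, and a higher risk score
    should correspond to an earlier event.
    """
    c_ee = n_ee = c_ec = n_ec = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier member of a comparable pair must be an observed event
        for j in range(n):
            if i == j or times[i] >= times[j]:
                continue  # only keep pairs where i's event precedes j's time
            if risks[i] > risks[j]:
                score = 1.0
            elif risks[i] == risks[j]:
                score = 0.5
            else:
                score = 0.0
            if events[j]:
                n_ee += 1
                c_ee += score
            else:
                n_ec += 1
                c_ec += score

    if min(n_ee, n_ec, c_ee, c_ec) == 0:
        raise ValueError("both pair types need at least one concordant pair")

    ci = (c_ee + c_ec) / (n_ee + n_ec)   # overall C-index
    ci_ee = c_ee / n_ee                  # events vs. other events
    ci_ec = c_ec / n_ec                  # events vs. censored cases
    alpha = c_ee / (c_ee + c_ec)         # assumed weight: share of concordant pairs
    harmonic = 1.0 / (alpha / ci_ee + (1.0 - alpha) / ci_ec)
    return {"ci": ci, "ci_ee": ci_ee, "ci_ec": ci_ec,
            "alpha": alpha, "harmonic": harmonic}


if __name__ == "__main__":
    # Toy data: 1 marks an observed event, 0 marks a censored case.
    times = [2, 4, 5, 7, 9, 11]
    events = [1, 1, 0, 1, 0, 1]
    risks = [0.9, 0.4, 0.6, 0.5, 0.2, 0.3]
    result = cindex_decomposition(times, events, risks)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, the `harmonic` value reproduces `ci`, while `ci_ee` and `ci_ec` show separately how well a model ranks events against other events and against censored cases.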