Harrell's concordance index is a commonly used discrimination metric for survival models, particularly for models where the relative ordering of individuals' risk is time-independent, such as the proportional hazards model. There are several suggestions, but no consensus, on how it could be extended to models where relative risk can vary over time, e.g. in the case of crossing hazard rates. We show that these concordance indices are not proper, where a proper index is one that is maximised in the limit by the true data-generating model. Furthermore, we show that a concordance index is proper if and only if the risk score used is concordant with the hazard rate at the first event time for each comparable pair of events. Thus, we suggest using the hazard rate as the time-varying risk score when calculating concordance. Through simulations, we demonstrate situations in which other concordance indices can lead to incorrect models being selected over the true model, justifying the use of our suggested risk prediction both in model selection and in loss functions for, e.g., deep learning models.
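
To make the suggested calculation concrete, the following is a minimal Python sketch of a concordance index in which the risk score of both individuals in a comparable pair is evaluated at the earlier of the two event times. It is an illustration under stated assumptions, not the authors' implementation: the function names `time_dependent_concordance` and `weibull_hazard`, the half credit given to tied risk scores, the exclusion of tied times, and the two-group Weibull hazards (which cross near t = 0.4) are all hypothetical choices made for this example.

```python
from typing import Callable, Sequence


def time_dependent_concordance(
    times: Sequence[float],
    events: Sequence[bool],
    covariates: Sequence[int],
    risk: Callable[[float, int], float],
) -> float:
    """Concordance with the risk score evaluated at the first event time.

    A pair (i, j) is comparable when i has an observed event and
    times[i] < times[j]. Pairs with tied times are skipped, and tied
    risk scores receive half credit (both are assumptions of this sketch).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored individual cannot be the first event
        for j in range(n):
            if times[j] <= times[i]:
                continue  # j must outlast i's event time
            comparable += 1
            # Both scores are evaluated at the earlier event time times[i].
            r_i = risk(times[i], covariates[i])
            r_j = risk(times[i], covariates[j])
            if r_i > r_j:
                concordant += 1.0
            elif r_i == r_j:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def weibull_hazard(t: float, group: int) -> float:
    """Illustrative crossing hazards: shape 0.5 for group 0, shape 2 for group 1."""
    shape = 0.5 if group == 0 else 2.0
    return shape * t ** (shape - 1.0)


# Toy data: four individuals, the last one censored.
times = [0.1, 0.2, 1.0, 2.0]
events = [True, True, True, False]
groups = [0, 1, 1, 0]

# Hazard rate as the time-varying risk score.
c_hazard = time_dependent_concordance(times, events, groups, weibull_hazard)

# A time-independent score (group membership) for comparison.
c_static = time_dependent_concordance(times, events, groups, lambda t, g: float(g))
```

On this toy data the hazard-based score ranks more pairs correctly than the static group label, because the ordering of the two groups' hazards flips over time. The example is far too small to say anything about properness; it only shows where the hazard rate enters the pairwise comparison.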