In many applications, it is important to identify subpopulations that survive longer or shorter than the rest of the population. In medicine, for example, this makes it possible to determine which patients benefit from a treatment, and in predictive maintenance, which components are more likely to fail. Existing methods for discovering subgroups with exceptional survival characteristics rely on restrictive assumptions about the survival model (e.g., proportional hazards), require pre-discretized features, and, because they compare average statistics, tend to overlook individual heterogeneity. In this paper, we propose Sysurv, a non-parametric, fully differentiable method that discovers human-readable rules selecting subgroups with exceptional survival characteristics. Empirical evaluation on a wide range of datasets and settings, including a case study on cancer data, shows that Sysurv reveals insightful and actionable survival subgroups and outperforms the state of the art.