Survival analysis is a branch of statistics that models the time until a specific event occurs, and it is widely used in medicine, engineering, finance, and many other fields. Choosing a survival model typically involves a trade-off between performance and interpretability, with the highest performance achieved by black-box models based on deep learning. This is a major problem in fields such as medicine, where practitioners are reluctant to blindly trust black-box models to make important patient decisions. Kolmogorov-Arnold Networks (KANs) were recently proposed as an interpretable and accurate alternative to multi-layer perceptrons (MLPs). We introduce CoxKAN, a Cox proportional hazards Kolmogorov-Arnold Network for interpretable, high-performance survival analysis. We evaluate CoxKAN on 4 synthetic datasets and 9 real medical datasets. The synthetic experiments demonstrate that CoxKAN accurately recovers interpretable symbolic formulae for the hazard function and effectively performs automatic feature selection. Evaluation on the 9 real datasets shows that CoxKAN consistently outperforms the Cox proportional hazards model and achieves performance that is superior or comparable to that of tuned MLPs. Furthermore, we find that CoxKAN identifies complex interactions between predictor variables that would be extremely difficult to recognise using existing survival methods, and automatically finds symbolic formulae that uncover the precise effect of important biomarkers on patient risk.
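For readers unfamiliar with the Cox proportional hazards setting, models in this family output one log-risk score per subject and are commonly fit by minimising the negative partial log-likelihood. The following is a minimal sketch of that objective, not the CoxKAN implementation: the function name `cox_neg_log_partial_likelihood` is hypothetical, tied event times are assumed absent, and averaging over the number of events is an illustrative choice.

```python
import numpy as np

def cox_neg_log_partial_likelihood(log_risk, durations, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    Illustrative sketch only: assumes no tied durations.
    log_risk  : model output per subject (log of the relative hazard)
    durations : observed time to event or censoring
    events    : 1 if the event was observed, 0 if censored
    """
    log_risk = np.asarray(log_risk, dtype=float)
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)

    # Sort by descending duration so the risk set of the subject at
    # position i (everyone still at risk at that time) is positions 0..i.
    order = np.argsort(-durations, kind="stable")
    log_risk, events = log_risk[order], events[order]

    # Numerically stable running log-sum-exp over each risk set.
    log_risk_set = np.logaddexp.accumulate(log_risk)

    # Only subjects with an observed event contribute a term.
    n_events = max(int(events.sum()), 1)
    return -np.sum((log_risk - log_risk_set)[events]) / n_events


# Toy data: three subjects, all events observed.
durations = [5.0, 3.0, 1.0]
events = [1, 1, 1]
loss_good = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], durations, events)
loss_bad = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], durations, events)
print(loss_good < loss_bad)
```

In this toy example, assigning higher risk to the subjects who fail earlier yields the lower loss, which is the ordering behaviour the partial likelihood rewards. Any network that maps covariates to a scalar log-risk, whether an MLP or a KAN, could in principle be trained against an objective of this form.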