In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method computes local feature importance for machine learning algorithms designed to model survival analysis data. Our implementation exploits parallelisation: all computations are performed matrix-wise, which reduces execution time. Additionally, SurvLIMEpy provides visualisation tools that help the user interpret the output of the algorithm. The package supports a wide variety of survival models, from the Cox Proportional Hazards Model to deep learning models such as DeepHit or DeepSurv. Two types of experiments are presented in this paper. First, using simulated data, we study the ability of the algorithm to capture the importance of the features. Second, we use three open-source survival datasets together with a set of survival algorithms to demonstrate how SurvLIMEpy behaves when applied to different models.
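To illustrate what "matrix-wise" computation can mean in this setting, the following is a minimal NumPy sketch and not code from SurvLIMEpy. It assumes a Cox-type surrogate whose cumulative hazard is H(t | z) = H0(t) · exp(zᵀb), and evaluates it for all perturbed neighbours and all time points in one broadcasted product instead of a double loop. All names (`neighbours`, `baseline_cumhaz`, `b`) and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one point of interest and N perturbed neighbours around it.
n_features, n_neighbours = 3, 500
x = np.array([0.5, -1.0, 2.0])
neighbours = x + rng.normal(scale=0.5, size=(n_neighbours, n_features))

# Assumed baseline cumulative hazard H0(t_j) on a grid of m time points.
times = np.linspace(0.1, 5.0, 20)
baseline_cumhaz = 0.1 * times

# Assumed surrogate coefficients b (illustrative values, not fitted).
b = np.array([0.3, -0.2, 0.1])

# Matrix-wise evaluation: an (N, 1) column times a (1, m) row gives an (N, m) matrix.
risk = np.exp(neighbours @ b)                      # shape (N,)
cumhaz_vectorised = risk[:, None] * baseline_cumhaz[None, :]

# Equivalent explicit double loop, kept only to check the vectorised result.
cumhaz_loop = np.empty((n_neighbours, times.size))
for k in range(n_neighbours):
    for j in range(times.size):
        cumhaz_loop[k, j] = baseline_cumhaz[j] * np.exp(neighbours[k] @ b)

assert np.allclose(cumhaz_vectorised, cumhaz_loop)
```

Both versions produce the same (N, m) matrix. The vectorised form delegates the iteration to NumPy's compiled routines, which is the kind of speed-up the abstract attributes to matrix-wise computation.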