In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method computes local feature importance for machine learning algorithms designed to model survival analysis data. Our implementation exploits parallelisation: all computations are performed in a matrix-wise fashion, which reduces execution time. Additionally, SurvLIMEpy provides visualisation tools that help the user interpret the output of the algorithm. The package supports a wide variety of survival models, from the Cox Proportional Hazards Model to deep learning models such as DeepHit or DeepSurv. Two types of experiments are presented in this paper. First, using simulated data, we study the ability of the algorithm to capture the importance of the features. Second, we use three open-source survival datasets together with a set of survival algorithms to demonstrate how SurvLIMEpy behaves when applied to different models.