An explanation method called SurvBeX is proposed for interpreting the predictions of black-box machine learning survival models. The main idea is to use a modified Beran estimator as the surrogate explanation model. Coefficients incorporated into the Beran estimator can be regarded as quantifying the impact of each feature on the black-box model's prediction. Following the well-known LIME method, many points are generated in a local neighborhood of the example of interest. For every generated point, the survival function of the black-box model is computed, and the survival function of the surrogate model (the Beran estimator) is constructed as a function of the explanation coefficients. The explanation coefficients are found by minimizing the mean distance between the survival functions of the black-box model and of the Beran estimator over the generated points. Numerous numerical experiments with synthetic and real survival data demonstrate the effectiveness of SurvBeX and compare it with the well-known methods SurvLIME and SurvSHAP. The code implementing SurvBeX is available at: https://github.com/DanilaEremenko/SurvBeX
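The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that), and several pieces are assumptions made only for illustration: the Gaussian kernel with per-feature coefficients `b`, the plain mean squared difference used as the distance between survival functions, the hypothetical `toy_black_box` model, the synthetic data, and the crude random search that stands in for a proper optimizer.

```python
import math
import random


def beran_survival(x, train, b, tau=1.0):
    """Beran estimate of S(t | x) at each training time, in ascending order.

    train: list of (features, time, event) tuples; event is 1 if observed.
    b: per-feature coefficients scaling the kernel distance (assumed form).
    """
    rows = sorted(train, key=lambda r: r[1])
    # Gaussian kernel weights, normalised to sum to 1.
    logits = [-sum((bk * (xk - zk)) ** 2 for bk, xk, zk in zip(b, x, z)) / tau
              for z, _, _ in rows]
    top = max(logits)
    raw = [math.exp(v - top) for v in logits]
    total = sum(raw)
    weights = [r / total for r in raw]

    surv, s, used = [], 1.0, 0.0
    for w, (_, _, event) in zip(weights, rows):
        if event:  # censored observations leave the product unchanged
            remaining = max(1.0 - used, 1e-12)
            s *= min(max(1.0 - w / remaining, 0.0), 1.0)
        used += w  # cumulative weight of earlier observations
        surv.append(s)
    return surv


def toy_black_box(x, times):
    """Hypothetical black box: exponential survival driven by feature 0 only."""
    rate = math.exp(x[0])
    return [math.exp(-rate * t) for t in times]


def explanation_loss(b, points, train, black_box):
    """Mean squared difference between black-box and Beran survival functions."""
    times = sorted(t for _, t, _ in train)
    total = 0.0
    for p in points:
        s_bb = black_box(p, times)
        s_sur = beran_survival(p, train, b)
        total += sum((u - v) ** 2 for u, v in zip(s_bb, s_sur)) / len(times)
    return total / len(points)


rng = random.Random(0)

# Synthetic training set: two features, roughly 20% censoring (illustrative).
train = []
for _ in range(60):
    z = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    t = rng.expovariate(math.exp(z[0]))
    train.append((z, t, 1 if rng.random() < 0.8 else 0))

# LIME-style perturbations around the example of interest x0.
x0 = [0.2, -0.3]
points = [[xk + rng.gauss(0, 0.1) for xk in x0] for _ in range(30)]

# Random search over b, keeping a candidate only if it lowers the loss.
best_b = [1.0, 1.0]
init_loss = explanation_loss(best_b, points, train, toy_black_box)
best_loss = init_loss
for _ in range(200):
    cand = [bk + rng.gauss(0, 0.3) for bk in best_b]
    loss = explanation_loss(cand, points, train, toy_black_box)
    if loss < best_loss:
        best_b, best_loss = cand, loss

surv_x0 = beran_survival(x0, train, best_b)
print("coefficient magnitudes:", [abs(v) for v in best_b])
print("loss before / after:", init_loss, best_loss)
```

Under these assumptions, a larger `abs(b[k])` makes the surrogate more sensitive to differences in feature `k`, which is how the coefficients could be read as feature impacts. Because the kernel squares `b`, only the magnitudes are meaningful in this sketch.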