We investigate stratified sampling for LIME Image, a popular model-agnostic explainable AI method for computer vision tasks, as a way to reduce the artifacts produced by standard Monte Carlo sampling. These artifacts arise from undersampling of the dependent variable in the synthetic neighborhood around the image being explained, which can make it impossible to fit a linear regressor on the sampled data and thus lead to inadequate explanations. We then highlight a connection with Shapley theory, where similar arguments about undersampling and sample relevance have been made in the past. We derive all the formulas and adjustment factors required for an unbiased stratified sampling estimator. Experiments show the efficacy of the proposed approach.
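As a rough illustration of the idea, the sketch below draws binary superpixel masks stratum by stratum and attaches an adjustment weight to each one. It is a minimal sketch under stated assumptions, not the derivation from this work: it assumes the strata are defined by the number of active superpixels `k`, that the reference Monte Carlo sampler switches each of the `d` superpixels on independently with probability 0.5, and that every stratum receives the same number of samples. The name `stratified_masks` and its parameters are hypothetical.

```python
import math
import random


def stratified_masks(d, n_per_stratum, seed=0):
    """Draw binary masks over d superpixels, one stratum per active count k.

    Assumption: under plain Monte Carlo sampling each superpixel is active
    with probability 0.5, so stratum k has probability C(d, k) / 2**d.
    """
    rng = random.Random(seed)
    n_total = n_per_stratum * (d + 1)
    q_k = n_per_stratum / n_total  # share of samples given to each stratum
    masks, weights = [], []
    for k in range(d + 1):
        p_k = math.comb(d, k) / 2 ** d  # stratum probability under Monte Carlo
        for _ in range(n_per_stratum):
            active = set(rng.sample(range(d), k))  # k distinct superpixels
            masks.append([1 if j in active else 0 for j in range(d)])
            # Reweight so that weighted averages match the Monte Carlo target.
            weights.append(p_k / q_k)
    return masks, weights


masks, weights = stratified_masks(d=6, n_per_stratum=4)
```

With equal allocation, rare strata (very few or very many active superpixels) are sampled as often as common ones, and the weights `p_k / q_k` compensate for that oversampling when the masks are later used in a weighted regression.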