We present a model-agnostic algorithm for generating post-hoc explanations and uncertainty intervals for a machine learning model when only a static sample of inputs and outputs from the model is available, rather than direct access to the model itself. This situation may arise when model evaluations are expensive; when privacy, security and bandwidth constraints are imposed; or when there is a need for real-time, on-device explanations. Our algorithm uses a bootstrapping approach to quantify the uncertainty that inevitably arises when generating explanations from a finite sample of model queries. Through a simulation study, we show that the uncertainty intervals generated by our algorithm exhibit a favorable trade-off between interval width and coverage probability compared to the naive confidence intervals from classical regression analysis as well as current Bayesian approaches for quantifying explanation uncertainty. We further demonstrate the capabilities of our method by applying it to black-box models, including a deep neural network, trained on three real-world datasets.
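To make the bootstrapping idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the explanation is the coefficient vector of a linear surrogate fitted by least squares to the stored input-output pairs, and that uncertainty is summarized with a plain percentile bootstrap. The function name `bootstrap_surrogate_intervals`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def bootstrap_surrogate_intervals(X, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile-bootstrap intervals for linear surrogate coefficients.

    X: (n, d) array of stored model inputs.
    y: (n,) array of stored model outputs.
    Returns (point, lower, upper), each of shape (d,).
    Illustrative assumption: the explanation is a linear surrogate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Prepend an intercept column so the surrogate is y ~ b0 + X @ b.
    A = np.column_stack([np.ones(n), X])
    # Point estimate from the full static sample (intercept dropped).
    point = np.linalg.lstsq(A, y, rcond=None)[0][1:]

    coefs = np.empty((n_boot, d))
    for b in range(n_boot):
        # Resample rows with replacement; the model is never queried again.
        idx = rng.integers(0, n, size=n)
        coefs[b] = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0][1:]

    # Percentile interval over the bootstrap replicates.
    lower = np.quantile(coefs, alpha / 2, axis=0)
    upper = np.quantile(coefs, 1 - alpha / 2, axis=0)
    return point, lower, upper


# Toy static sample standing in for logged queries to a black-box model.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * data_rng.normal(size=200)

point, lower, upper = bootstrap_surrogate_intervals(X, y)
for j in range(X.shape[1]):
    print(f"feature {j}: {point[j]:+.3f}  [{lower[j]:+.3f}, {upper[j]:+.3f}]")
```

Only the fixed arrays `X` and `y` are used, which mirrors the setting where the model itself is unavailable. The interval width here reflects sampling variability of the surrogate fit under these assumptions. It says nothing about how this simple scheme compares with the method or the baselines evaluated in the paper.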