Recent works have extended notions of feature importance to semantic concepts that are inherently interpretable to the users interacting with a black-box predictive model. Yet, precise statistical guarantees, such as false positive rate and false discovery rate control, are needed to communicate findings transparently and to avoid unintended consequences in real-world scenarios. In this paper, we formalize the global (i.e., over a population) and local (i.e., for a sample) statistical importance of semantic concepts for the predictions of opaque models by means of conditional independence, which allows for rigorous testing. We use recent ideas of sequential kernelized independence testing (SKIT) to induce a ranking of importance across concepts, and showcase the effectiveness and flexibility of our framework on synthetic datasets as well as on image classification tasks using several diverse vision-language models.
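To make the sequential-testing idea concrete, here is a minimal, simplified sketch of SKIT-style testing by betting for (unconditional) independence between two variables. It is not the paper's exact payoff or conditional construction: the RBF kernels, the fixed betting fraction `lam`, and the decoupling-by-resampling step are illustrative assumptions. The key property is that the payoff is bounded and has zero mean under independence, so the wealth process is a nonnegative martingale and rejecting when it exceeds `1/alpha` controls the type-I error by Ville's inequality.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Gaussian (RBF) kernel; bounded in (0, 1]
    return np.exp(-gamma * np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def sequential_independence_test(X, Y, alpha=0.05, lam=0.5, seed=0):
    """Simplified SKIT-style test by betting (illustrative, not the
    authors' exact construction).

    At each step we compare the kernel similarity of a genuinely paired
    sample (x, y) against a decoupled one where y is replaced by an
    independently resampled y'. Under independence the payoff has zero
    mean, so wealth stays a nonnegative martingale; under dependence it
    tends to grow, and we reject once wealth >= 1/alpha.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    wealth = 1.0
    for t in range(1, n, 2):
        x0, y0 = X[t - 1], Y[t - 1]
        x1, y1 = X[t], Y[t]
        y_rand = Y[rng.integers(n)]              # decoupled draw from the marginal
        paired = rbf(x0, x1) * rbf(y0, y1)       # similarity under the joint
        decoupled = rbf(x0, x1) * rbf(y0, y_rand)
        payoff = paired - decoupled              # bounded in (-1, 1), mean 0 under H0
        wealth *= 1.0 + lam * payoff             # bet a fixed fraction of wealth
        if wealth >= 1.0 / alpha:
            return True, wealth                  # reject independence
    return False, wealth
```

In the paper's setting, a wealth process of this kind is run per concept (conditioning on the remaining concepts), and the times or magnitudes at which the processes cross the rejection threshold induce the ranking of concept importance.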