Model-agnostic explainable artificial intelligence for object detection in image data

Object detection is a fundamental task in computer vision that has advanced greatly through the development of large and intricate deep learning models. However, the lack of transparency of these models is a major challenge that may hinder their widespread adoption. Explainable artificial intelligence is a field of research in which methods are developed to help users understand the behavior, decision logic, and vulnerabilities of AI-based systems. Black-box explanation refers to explaining the decisions of an AI system without access to its internals. In this paper, we design and implement a black-box explanation method named Black-box Object Detection Explanation by Masking (BODEM), which adopts a new masking approach for AI-based object detection systems. We propose local and distant masking to generate multiple versions of an input image. Local masks disturb pixels within a target object to reveal how the object detector reacts to these changes, while distant masks assess how the detection model's decisions are affected by disturbing pixels outside the object. A saliency map is then created by estimating the importance of pixels from the difference between the detection output before and after masking. Finally, a heatmap is created that visualizes how important the pixels of the input image are to the detected objects. Experiments on various object detection datasets and models showed that BODEM can be used effectively to explain the behavior of object detectors and reveal their vulnerabilities. This makes BODEM suitable for explaining and validating AI-based object detection systems in black-box software testing scenarios. Furthermore, data augmentation experiments showed that local masks produced by BODEM can be used to further train object detectors and improve their detection accuracy and robustness.
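The masking idea summarized above can be illustrated with a minimal sketch. It is not the BODEM implementation: the abstract does not specify the mask shape, the mask size, or how the difference between detection outputs is measured, so the square zero-valued blocks, the absolute score difference, and the names `toy_detector` and `masking_saliency` are all assumptions made for illustration. The detector is treated purely as a black box that maps an image to a single score.

```python
import numpy as np


def toy_detector(image):
    """Hypothetical stand-in for a black-box detector (an assumption).

    Returns one confidence-like score: the mean brightness of a fixed region.
    A real detector would return bounding boxes and class scores.
    """
    return float(image[8:24, 8:24].mean())


def masking_saliency(image, detect, box, block=4, distant=False):
    """Mask one block at a time and record the change in the detector's score.

    box is (top, left, bottom, right). With distant=False only blocks lying
    fully inside the box are masked (local masks); with distant=True only the
    remaining blocks are masked (distant masks). Block size and the absolute
    difference measure are illustrative assumptions.
    """
    top, left, bottom, right = box
    base = detect(image)  # detection output before masking
    saliency = np.zeros(image.shape, dtype=float)
    for y in range(0, image.shape[0], block):
        for x in range(0, image.shape[1], block):
            inside = (top <= y and y + block <= bottom
                      and left <= x and x + block <= right)
            if inside == distant:
                continue  # skip blocks that do not belong to this mask type
            masked = image.copy()
            masked[y:y + block, x:x + block] = 0.0  # disturb the pixels
            # Importance = difference in output before and after masking.
            saliency[y:y + block, x:x + block] = abs(base - detect(masked))
    return saliency


image = np.ones((32, 32))
box = (8, 8, 24, 24)
local_map = masking_saliency(image, toy_detector, box)
distant_map = masking_saliency(image, toy_detector, box, distant=True)

# Scale to [0, 1] so the map can be rendered as a heatmap.
peak = local_map.max()
heatmap = local_map / peak if peak > 0 else local_map
```

In this toy setting the local map is non-zero only inside the box, and the distant map is zero everywhere because the stand-in detector ignores pixels outside its region. A detector that relied on surrounding context would instead show non-zero distant saliency, which is the kind of behavior the distant masks are meant to expose.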
