We introduce a gradient-free framework for identifying minimal, sufficient, and decision-preserving explanations in vision models by isolating the smallest subset of representational units whose joint activation preserves the prediction. Unlike existing approaches that aggregate all units, often producing cluttered saliency maps, our method, DD-CAM, identifies a 1-minimal subset whose joint activation suffices to preserve the prediction (i.e., removing any unit from the subset alters the prediction). To isolate minimal sufficient subsets efficiently, we adapt delta debugging, a systematic reduction strategy from software debugging, and configure its search according to unit interactions in the classifier head: testing individual units for models whose units do not interact, and testing unit combinations for models in which they do. We then generate minimal, prediction-preserving saliency maps that highlight only the most essential features. Our experimental evaluation demonstrates that our approach produces more faithful explanations and achieves higher localization accuracy than state-of-the-art CAM-based approaches.
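The reduction step described above can be sketched with a standard ddmin-style loop (a complement-testing variant of delta debugging). Here `preserves` is a hypothetical oracle, not part of the original text, that would re-run the model with only the given units active and report whether the prediction is unchanged; the loop returns a 1-minimal subset for which the oracle still holds.

```python
def ddmin(units, preserves):
    """Delta-debugging reduction: return a 1-minimal subset of `units`
    for which the predicate `preserves` still holds (removing any single
    remaining unit would make it fail)."""
    assert preserves(units), "predicate must hold for the full set"
    n = 2  # granularity: number of chunks to split the current set into
    while len(units) >= 2:
        # Partition the current set into n roughly equal chunks.
        k, m = divmod(len(units), n)
        chunks = [units[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
                  for i in range(n)]
        reduced = False
        for i in range(n):
            # Complement of chunk i: every unit except that chunk.
            complement = [u for j, c in enumerate(chunks) if j != i for u in c]
            if preserves(complement):
                units = complement        # chunk i was dispensable; drop it
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(units):
                break                     # single-unit chunks: set is 1-minimal
            n = min(n * 2, len(units))    # refine granularity and retry
    return units


# Toy oracle standing in for the model: the "prediction" is preserved
# exactly when units 3 and 7 are both present.
minimal = ddmin(list(range(1, 9)), lambda s: {3, 7} <= set(s))
print(minimal)  # → [3, 7]
```

In DD-CAM's terms, `preserves` would mask the feature maps of all units outside the candidate subset and check that the classifier's top prediction is unchanged; the non-interacting case in the abstract corresponds to the cheaper strategy of testing units one at a time instead of chunk complements.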