Perturbation-based explainability methods such as KernelSHAP provide model-agnostic attributions, but they are typically impractical for patch-based 3D medical image segmentation because they require a large number of coalition evaluations, each of which incurs costly sliding-window inference. We present an efficient KernelSHAP framework for volumetric CT segmentation that restricts computation to a user-defined region of interest and its receptive-field support, and that accelerates inference via patch logit caching, which reuses baseline predictions for unaffected patches while preserving nnU-Net's fusion scheme. To enable clinically meaningful attributions, we compare three automatically generated feature abstractions within the receptive-field crop (whole-organ units, regular FCC supervoxels, and hybrid organ-aware supervoxels) and study multiple aggregation/value functions that target either stabilizing evidence (TP/Dice/Soft Dice) or false-positive behavior. Experiments on whole-body CT segmentations show that caching substantially reduces redundant computation, with computational savings of 15% to 30%, and that faithfulness and interpretability exhibit clear trade-offs: regular supervoxels often maximize perturbation-based metrics but lack anatomical alignment, whereas organ-aware units yield more clinically interpretable explanations and are particularly effective at highlighting false-positive drivers under normalized metrics.
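The patch logit caching idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a 1D signal instead of a CT volume, `toy_model` stands in for the segmentation network, and the names `CachedSlidingWindow`, `patch_starts`, `gaussian_weights`, and the `sigma_scale` constant are assumptions chosen for illustration. The sketch caches per-patch logits for a reference input, recomputes only the patches whose input a coalition actually changed, and fuses all patches with a center-weighted average in the spirit of Gaussian sliding-window fusion.

```python
import math

def patch_starts(length, patch, step):
    """Start indices of sliding-window patches (assumes length >= patch)."""
    starts = list(range(0, length - patch + 1, step))
    if starts[-1] != length - patch:
        starts.append(length - patch)  # last patch sits flush with the border
    return starts

def gaussian_weights(patch, sigma_scale=0.125):
    """Center-weighted importance map; sigma_scale is an assumed constant."""
    center, sigma = (patch - 1) / 2, patch * sigma_scale
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(patch)]

def toy_model(patch_values):
    """Stand-in for a network: each logit depends on the whole patch."""
    mean = sum(patch_values) / len(patch_values)
    return [v - mean for v in patch_values]

def fuse(length, starts, patch_logits, weights):
    """Weighted average of overlapping patch logits."""
    acc, norm = [0.0] * length, [0.0] * length
    for start, logits in zip(starts, patch_logits):
        for i, (w, logit) in enumerate(zip(weights, logits)):
            acc[start + i] += w * logit
            norm[start + i] += w
    return [a / n for a, n in zip(acc, norm)]

class CachedSlidingWindow:
    """Caches per-patch logits of a reference input (hypothetical helper)."""

    def __init__(self, model, reference, patch, step):
        self.model, self.reference, self.patch = model, list(reference), patch
        self.starts = patch_starts(len(self.reference), patch, step)
        self.weights = gaussian_weights(patch)
        self.cache = [model(self.reference[s:s + patch]) for s in self.starts]

    def predict(self, perturbed):
        """Return fused logits and the number of patches recomputed."""
        logits, recomputed = [], 0
        for s, cached in zip(self.starts, self.cache):
            window = perturbed[s:s + self.patch]
            if window != self.reference[s:s + self.patch]:
                logits.append(self.model(window))  # input changed: rerun model
                recomputed += 1
            else:
                logits.append(cached)  # input unchanged: reuse cached logits
        fused = fuse(len(perturbed), self.starts, logits, self.weights)
        return fused, recomputed

# One coalition that masks a small "feature" near the left border.
reference = [float(i % 7) for i in range(32)]
perturbed = list(reference)
perturbed[1:3] = [0.0, 0.0]

engine = CachedSlidingWindow(toy_model, reference, patch=8, step=4)
fused, recomputed = engine.predict(perturbed)
```

Because unchanged patches receive identical inputs, the fused output matches a full sliding-window pass as long as the model is deterministic. How much is saved depends on how many patches each coalition touches, so the toy setting here says nothing about the savings reported for real volumes.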