Explaining deep learning models is essential for the clinical integration of medical image analysis systems. A good explanation reveals whether a model depends on spurious features that undermine generalization and harm a subset of patients or, conversely, may surface novel biological insights. Although techniques like GradCAM can identify influential features, they are measurement tools that do not themselves constitute an explanation. We propose a human-machine-VLM interaction system tailored to explaining classifiers in computational pathology, including multi-instance learning for whole-slide images. Our proof of concept comprises (1) an AI-integrated slide viewer for running sliding-window experiments that test the claims of an explanation, and (2) quantification of an explanation's predictiveness using general-purpose vision-language models. Our results demonstrate that this approach allows us to qualitatively test the claims of an explanation and to quantitatively distinguish between competing explanations. This offers a practical path from explainable AI to explained AI in digital pathology and beyond. Code and prompts are available at https://github.com/nki-ai/x2x.
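Below is a minimal, hedged sketch of the second component described above: using a general-purpose VLM to score how well patches support a textual explanation, and then measuring how predictive those scores are of the classifier's outputs. This is not the authors' implementation (see the repository for the actual code and prompts); the function names, the prompt wording, and the `gpt-4o` model choice are illustrative assumptions.

```python
# Hedged sketch: quantify an explanation's predictiveness with a general-purpose VLM.
# For each slide patch we ask the VLM how strongly it supports the explanatory claim,
# then compute the AUC of those scores against the classifier's binary predictions.
# All names and prompts here are illustrative assumptions, not the authors' pipeline.

import base64
from openai import OpenAI
from sklearn.metrics import roc_auc_score

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def score_patch_against_claim(image_path: str, claim: str) -> float:
    """Ask the VLM to rate (0-10) how strongly a patch supports the claim."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any general-purpose VLM endpoint could be used
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"On a scale of 0-10, how strongly does this pathology patch "
                         f"support the claim: '{claim}'? Reply with a number only."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return float(response.choices[0].message.content.strip())


def explanation_predictiveness(patch_paths, classifier_preds, claim) -> float:
    """AUC of VLM claim scores against the classifier's binary predictions.

    A predictive explanation should separate the patches the classifier labels
    positive from those it labels negative; competing explanations can then be
    compared by their AUC.
    """
    vlm_scores = [score_patch_against_claim(p, claim) for p in patch_paths]
    return roc_auc_score(classifier_preds, vlm_scores)
```

The design intuition is that an explanation is only useful if acting on it recovers the model's behaviour: if the VLM's claim scores fail to track the classifier's predictions, the claim is either wrong or not the feature the model relies on.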