Deep neural two-sample tests have recently shown strong power for detecting distributional differences between groups, yet their black-box nature limits interpretability and practical adoption in biomedical analysis. Moreover, most existing post-hoc explainability methods rely on class labels, making them unsuitable for label-free statistical testing. We propose an explainable deep statistical testing framework that augments deep two-sample tests with sample-level and feature-level explanations, revealing which individual samples and which input features drive a statistically significant group difference. For imaging data, the feature-level explanations highlight the image regions that contribute most to the detected difference, so the framework provides both instance-wise and spatial insight into the test's decision. Applied to biomedical imaging data, the framework identifies influential samples and highlights anatomically meaningful regions associated with disease-related variation. This work bridges statistical inference and explainable AI, enabling interpretable, label-free population analysis in medical imaging.
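The abstract does not state which test statistic or attribution mechanism the framework uses, so the following is only a minimal sketch of the general idea of a label-free two-sample test with sample-level scores, not the authors' method. It assumes a kernel maximum mean discrepancy (MMD) statistic with a Gaussian kernel, a permutation p-value, and the MMD witness function as a per-sample score. Plain feature vectors stand in for learned deep features, and every function name, the bandwidth, and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    # Pairwise squared Euclidean distances, mapped through an RBF kernel.
    sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2(x, y, bandwidth):
    # Biased (V-statistic) estimate of the squared MMD between x and y.
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())

def permutation_pvalue(x, y, bandwidth, n_perm=200, seed=0):
    # Null distribution obtained by reshuffling group membership.
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if mmd2(pooled[idx[:n]], pooled[idx[n:]], bandwidth) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def witness_scores(points, x, y, bandwidth):
    # Witness function: mean similarity to group x minus mean similarity to group y.
    # Large absolute values mark samples that sit where the groups differ most.
    return (gaussian_kernel(points, x, bandwidth).mean(axis=1)
            - gaussian_kernel(points, y, bandwidth).mean(axis=1))

# Synthetic stand-in data (assumption): the groups differ only in feature 0.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(100, 5))
y = rng.normal(0.0, 1.0, size=(100, 5))
y[:, 0] += 1.5

p_value = permutation_pvalue(x, y, bandwidth=2.0)
scores_x = witness_scores(x, x, y, bandwidth=2.0)
scores_y = witness_scores(y, x, y, bandwidth=2.0)
most_influential_y = np.argsort(scores_y)[:5]  # y samples least like group x
print(f"p-value: {p_value:.4f}")
print("most influential samples in y:", most_influential_y)
```

No class labels enter the computation beyond group membership, which matches the label-free setting described above. A feature-level explanation would additionally need the gradient of such a score with respect to the input pixels, which this sketch leaves out.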