This paper introduces an efficient sub-model ensemble framework designed to enhance the interpretability of medical deep learning models and thereby increase their clinical applicability. By generating uncertainty maps, the framework enables end-users to assess the reliability of model outputs. We developed a strategy for deriving diverse sub-models from a single well-trained checkpoint, enabling the training of a model family. The framework produces multiple outputs from a single input, fuses them into a final output, and estimates uncertainty from the disagreement among those outputs. Implemented with U-Net and UNETR models for segmentation and synthesis tasks, the approach was evaluated on CT body segmentation and MR-CT synthesis datasets. It achieved a mean Dice coefficient of 0.814 in segmentation and a Mean Absolute Error of 88.17 HU in synthesis, improved from 89.43 HU through pruning. The framework was further evaluated under input corruption and undersampling, where uncertainty remained correlated with error, demonstrating its robustness. These results suggest that the proposed approach preserves the performance of well-trained models while enhancing interpretability through effective uncertainty estimation, and that it applies to both convolutional and transformer models across a range of imaging tasks.
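The fuse-and-disagree step described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's exact rule: it assumes mean fusion of sub-model outputs and the per-voxel standard deviation as the disagreement-based uncertainty map; the function name `ensemble_fuse` is hypothetical.

```python
import numpy as np

def ensemble_fuse(outputs):
    """Fuse sub-model predictions and derive a per-voxel uncertainty map.

    `outputs` is a list of arrays, one per sub-model, each holding
    per-voxel predictions (e.g. segmentation probabilities or
    synthesized HU values). Fusion here is the ensemble mean; the
    uncertainty map is the per-voxel standard deviation, i.e. the
    disagreement among sub-models (an assumed, simple choice).
    """
    stacked = np.stack(outputs, axis=0)   # shape: (n_models, *volume_shape)
    fused = stacked.mean(axis=0)          # fused final output
    uncertainty = stacked.std(axis=0)     # high where sub-models disagree
    return fused, uncertainty

# Toy example: three sub-model predictions over a 2x2 "volume"
preds = [np.array([[0.9, 0.1], [0.8, 0.2]]),
         np.array([[0.8, 0.2], [0.7, 0.3]]),
         np.array([[1.0, 0.0], [0.9, 0.1]])]
fused, unc = ensemble_fuse(preds)
```

In this sketch, voxels where the sub-models agree closely get a low uncertainty value, so an end-user reading the map can see which regions of the output to trust.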