M3BUNet: Mobile Mean Max UNet for Pancreas Segmentation on CT-Scans

Segmenting organs in CT scans is a prerequisite for many downstream medical image analysis tasks. Manual segmentation by radiologists remains prevalent, especially for organs such as the pancreas, whose small size, occlusion, and variable shape demand a high level of domain expertise for reliable delineation. For automated pancreas segmentation, these same factors limit the amount of reliably labeled data available for training, and the performance of contemporary pancreas segmentation models consequently remains below acceptable levels. To address this, we propose M3BUNet, a fusion of the MobileNet and U-Net architectures equipped with a novel Mean-Max (MM) attention mechanism, which operates in two stages to segment pancreas CT images progressively from coarse to fine, with mask guidance for object detection. This design allows the network to surpass the segmentation performance of similar architectures and to match complex state-of-the-art methods while maintaining a low parameter count. In addition, we introduce external contour segmentation as a preprocessing step for the coarse stage, which assists segmentation by standardizing the input images. For the fine stage, we find that applying a wavelet decomposition filter to create multi-input images improves pancreas segmentation performance. We evaluate our approach extensively on the widely used NIH pancreas dataset and the MSD pancreas dataset. It achieves an average Dice Similarity Coefficient (DSC) of up to 89.53% and an Intersection over Union (IOU) of up to 81.16% on the NIH pancreas dataset, and 88.60% DSC and 79.90% IOU on the MSD pancreas dataset.
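For readers less familiar with the two reported metrics, the following is a minimal sketch of how DSC and IOU are commonly computed for one pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the plain nested-list mask representation, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
from itertools import chain


def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as 2D lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Count pixels that are foreground in both masks, and in each mask alone.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    dsc = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dsc, iou


# Toy 3x4 masks: 4 predicted pixels, 4 ground-truth pixels, 3 shared.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0]]

dsc, iou = dice_and_iou(pred, target)
print(f"DSC = {dsc:.2f}, IoU = {iou:.2f}")  # DSC = 0.75, IoU = 0.60
```

For a single pair of masks the two metrics are tied by IoU = DSC / (2 - DSC). That identity does not generally hold for scores averaged over many cases, so averaged DSC and IOU figures need not satisfy it exactly.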
