IP-UNet: Intensity Projection UNet Architecture for 3D Medical Volume Segmentation

Convolutional neural networks (CNNs) have been widely applied to medical image analysis. However, limited memory capacity is one of the most common obstacles to processing high-resolution 3D volumetric data. 3D volumes are usually cropped or downsampled before processing, which can reduce resolution, increase class imbalance, and degrade the performance of segmentation algorithms. In this paper, we propose an end-to-end deep learning approach called IP-UNet. IP-UNet is a UNet-based model that performs multi-class segmentation on Intensity Projections (IPs) of 3D volumetric data instead of the memory-consuming 3D volumes themselves. IP-UNet can be trained with limited memory without losing the original 3D image resolution. We compare three models in terms of segmentation accuracy and computational cost: 1) a conventional 2D UNet that segments the CT scan images slice by slice; 2) IP-UNet, which merges the Maximum Intensity Projection (MIP), Closest Vessel Projection (CVP), and Average Intensity Projection (AvgIP) representations extracted from the source 3D volumes and applies the UNet model to the resulting IP images; and 3) a 3D-UNet that directly reads the 3D volumes constructed from a series of CT scan images and outputs the 3D volume of the predicted segmentation. We test these methods on 3D volumetric images for automatic breast calcification detection. Experimental results show that IP-UNet achieves segmentation accuracy similar to that of 3D-UNet at a much lower computational cost: it reduces training time by 70% and memory consumption by 92%.
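
The abstract does not say how the three projections are computed or merged, so the following is only a minimal NumPy sketch under stated assumptions: the projection runs along the slice axis, the three images are stacked as channels, and CVP is simplified to "the first voxel along each ray whose intensity exceeds a threshold", falling back to the MIP value when no voxel qualifies. The function name `intensity_projections` and the `threshold` parameter are hypothetical and do not come from the paper.

```python
import numpy as np

def intensity_projections(volume, axis=0, threshold=0.5):
    """Collapse a 3D volume into a 3-channel 2D image: (MIP, CVP, AvgIP).

    Assumption: CVP is approximated by the first voxel along each ray that
    exceeds `threshold`; this is a simplification, not the paper's definition.
    """
    vol = np.asarray(volume, dtype=np.float32)

    # Maximum Intensity Projection: brightest voxel along each ray.
    mip = vol.max(axis=axis)

    # Average Intensity Projection: mean intensity along each ray.
    avg = vol.mean(axis=axis)

    # Simplified Closest Vessel Projection.
    rays = np.moveaxis(vol, axis, 0)       # projection axis first
    above = rays > threshold               # voxels bright enough to count
    first = above.argmax(axis=0)           # index of first True (0 if none)
    cvp = np.take_along_axis(rays, first[None, ...], axis=0)[0]
    cvp = np.where(above.any(axis=0), cvp, mip)  # fall back to MIP

    # Stack as channels so a 2D network can consume all three at once.
    return np.stack([mip, cvp, avg], axis=0)


# Tiny synthetic example: 4 slices of 2x2 pixels.
volume = np.zeros((4, 2, 2), dtype=np.float32)
volume[1, 0, 0] = 0.6   # closer structure above the threshold
volume[2, 0, 0] = 0.9   # brighter structure further along the ray
projections = intensity_projections(volume)
print(projections.shape)
```

Under these assumptions, a volume of D slices of size H×W is reduced to a 3×H×W input, so the input size no longer grows with the number of slices. How this relates to the memory savings reported above depends on details the abstract does not give.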
