Accurate lung nodule segmentation is crucial for early-stage lung cancer diagnosis, as it can substantially improve patient survival rates. Computed tomography (CT) images are widely used for early diagnosis in lung nodule analysis. However, the heterogeneity of lung nodules, their diversity in size, and the complexity of the surrounding environment make it challenging to develop robust nodule segmentation methods. In this study, we propose an efficient end-to-end framework, the multi-encoder-based self-adaptive hard attention network (MESAHA-Net), for precise lung nodule segmentation in CT scans. MESAHA-Net comprises three encoding paths, an attention block, and a decoder block, which together integrate three types of input: CT slice patches, forward and backward maximum intensity projection (MIP) images, and region of interest (ROI) masks encompassing the nodule. Using a novel adaptive hard attention mechanism, MESAHA-Net iteratively performs slice-by-slice 2D segmentation, focusing on the nodule region in each slice, to generate a 3D volumetric segmentation of the nodule. The proposed framework has been comprehensively evaluated on the LIDC-IDRI dataset, the largest publicly available dataset for lung nodule segmentation. The results demonstrate that our approach is highly robust across lung nodule types and outperforms previous state-of-the-art techniques in both segmentation accuracy and computational complexity, making it suitable for real-time clinical implementation.
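To make the forward and backward MIP inputs concrete, the following is a minimal sketch rather than the implementation used in MESAHA-Net. It assumes that the forward MIP of slice k is the pixelwise maximum over slice k and the next `depth` slices, that the backward MIP is the pixelwise maximum over slice k and the previous `depth` slices, and that windows are clipped at the volume boundary. The function names `mip` and `forward_backward_mip` and the `depth` parameter are hypothetical. Plain Python lists keep the sketch dependency-free; in practice the same projection is usually computed on an array with a maximum along the slice axis.

```python
from typing import List, Tuple

# One 2D CT slice patch, stored as rows of intensity values.
Slice = List[List[float]]


def mip(slices: List[Slice]) -> Slice:
    """Pixelwise maximum over a stack of equally sized 2D slices."""
    if not slices:
        raise ValueError("need at least one slice")
    # zip(*slices) groups matching rows; zip(*rows) groups matching pixels.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*slices)]


def forward_backward_mip(
    volume: List[Slice], k: int, depth: int
) -> Tuple[Slice, Slice]:
    """Return (forward, backward) MIP images for slice k.

    Assumption (not taken from the paper): each projection includes slice k
    itself plus up to `depth` neighbouring slices in one direction, clipped
    at the first and last slice of the volume.
    """
    last = len(volume) - 1
    if not 0 <= k <= last:
        raise IndexError("slice index out of range")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    forward = mip(volume[k : min(k + depth, last) + 1])
    backward = mip(volume[max(k - depth, 0) : k + 1])
    return forward, backward
```

In a slice-by-slice pipeline of the kind described above, the two projections for slice k would be passed to the network alongside the CT slice patch and the ROI mask; how the paper chooses the projection depth is not stated in the abstract.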