Lung cancer remains one of the most common and deadliest cancers worldwide. Because the likelihood of successful treatment depends strongly on the stage at which the disease is diagnosed, early detection is a critical medical challenge. The task is difficult for thoracic radiologists: the number of studies to review is large, a single patient's lungs may contain multiple nodules, and many nodules are small enough to complicate visual assessment. Automated systems with accurate and computationally efficient lung nodule detection and classification modules are therefore essential. This study introduces three methodological improvements for lung nodule classification: (1) an advanced CT scan cropping strategy that focuses the model on the target nodule while reducing computational cost; (2) target filtering techniques for removing noisy labels; (3) novel augmentation methods to improve model robustness. Together, these techniques yield a robust classification subsystem within a comprehensive Clinical Decision Support System for lung cancer detection, capable of operating across diverse acquisition protocols, scanner types, and upstream models (segmentation or detection). On the LIDC-IDRI dataset, the multiclass model achieved a Macro ROC AUC of 0.9176 and a Macro F1-score of 0.7658, while the binary model reached a Binary ROC AUC of 0.9383 and a Binary F1-score of 0.8668. These results outperform several previously reported approaches and demonstrate state-of-the-art performance for this task.
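For readers unfamiliar with the macro-averaged scores reported above, the following is a minimal sketch of how such metrics are commonly computed. It is not the authors' evaluation code. All function names are hypothetical, classes are assumed to be integer indices 0..K-1, macro ROC AUC is assumed to use one-vs-rest averaging, and only classes present in the ground truth are averaged.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1-score of one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 (assumption: classes taken from y_true)."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def binary_roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outscores a negative.

    Ties count as one half. This pairwise form is quadratic in the number
    of samples, which is acceptable for a small illustration.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_roc_auc(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> float:
    """One-vs-rest ROC AUC averaged over classes with equal weight.

    probs[i][c] is the predicted score of class c for sample i.
    """
    classes = sorted(set(y_true))
    aucs = []
    for c in classes:
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[c] for row in probs]
        aucs.append(binary_roc_auc(labels, scores))
    return sum(aucs) / len(aucs)
```

Because macro averaging weights every class equally, a rare class affects the score as much as a frequent one, which is one reason a macro F1-score can sit well below the corresponding ROC AUC on imbalanced data.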