We present a reproducible deep learning pipeline for leukemic cell classification, focusing on system architecture, experimental robustness, and software design choices for medical image analysis. Acute lymphoblastic leukemia (ALL) is the most common childhood cancer, and its diagnosis relies on expert microscopic examination, which is time-consuming and subject to inter-observer variability. The proposed system classifies ALL cells automatically with an attention-based convolutional neural network that combines EfficientNetV2-B3 with Squeeze-and-Excitation mechanisms. Our approach employs comprehensive data augmentation, focal loss to address class imbalance, and patient-wise data splitting to ensure robust and reproducible evaluation. On the C-NMC 2019 dataset (12,528 original images from 62 patients), the system achieves a 97.89% F1-score and 97.89% accuracy on the test set, and 100-iteration Monte Carlo experiments confirm statistically significant improvements (p < 0.001) over baseline methods. The proposed pipeline outperforms existing approaches by up to 4.67% while using 89% fewer parameters than VGG16 (15.2M vs. 138M). The attention mechanism provides interpretable visualizations of diagnostically relevant cellular features, demonstrating that modern attention-based architectures can improve leukemic cell classification while maintaining computational efficiency suitable for clinical deployment.
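Two of the evaluation choices named in the abstract, patient-wise data splitting and focal loss, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(patient_id, image_id, label)`, the function names, the `test_fraction` and `seed` values, and the focal loss settings `gamma=2.0` and `alpha=0.25` (commonly used defaults) are all assumptions made for illustration, since the abstract does not report them.

```python
import math
import random
from collections import defaultdict


def patient_wise_split(samples, test_fraction=0.2, seed=0):
    """Split (patient_id, image_id, label) records so no patient is in both sets.

    test_fraction and seed are illustrative assumptions, not reported values.
    """
    # Group all images belonging to the same patient.
    by_patient = defaultdict(list)
    for record in samples:
        by_patient[record[0]].append(record)

    # Shuffle patients (not images) so the split is decided per patient.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = max(1, round(len(patient_ids) * test_fraction))
    test = [r for pid in patient_ids[:n_test] for r in by_patient[pid]]
    train = [r for pid in patient_ids[n_test:] for r in by_patient[pid]]
    return train, test


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for one prediction: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y is the 0/1 label.
    gamma and alpha are assumed defaults, not values from the abstract.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Splitting by patient rather than by image keeps cells from the same patient out of both the training and test sets, which would otherwise inflate test scores. The `(1 - p_t)**gamma` factor in the focal loss down-weights examples the model already classifies confidently, so training concentrates on harder and under-represented cases.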