Bone marrow smear review remains important for the assessment of acute myeloid leukemia (AML), but manual single-cell interpretation is labor-intensive, and a patient-level diagnosis requires aggregating many cellular observations. We present a cell-to-patient deep learning pipeline for assisted AML diagnosis from bone marrow smear images. The study included 258 patients from six anonymized centers: a main cohort of 169 patients from Centers 1-3 and an external validation cohort of 89 patients from Centers 4-6. A 16-category cell annotation vocabulary described the overall cellular composition, covering granulocytic, monocytic, erythroid, lymphoid, eosinophilic, and other cells. Rather than identifying AML blasts or leukemic blasts in the strict sense, the model targets an expert-defined composite category, Composite Blast-like Cells (CBLC), which comprises the N, N1, M, M1, R, R1, J, and J1 categories under the project-wide morphological standard. A fixed YOLO-based segmentation module detected cells; predicted contours were matched to expert polygon annotations by contour IoU, and standardized single-cell crops were generated. An EfficientNet-B0 classifier was trained with a two-stage GT-to-YOLO and YOLO-to-YOLO strategy that incorporated class-imbalance correction, center-border regularization, and morphology-assisted supervision. Cell-level predictions were aggregated into patient-level CBLC ratios for AML-oriented diagnostic support. Internal validation performance was stable, and the pipeline generalized to the external centers, with ensemble weighted F1-scores of 0.9076, 0.8696, and 0.9124 on Centers 4, 5, and 6, respectively.
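The cell-to-patient aggregation step can be illustrated with a minimal sketch. It assumes each single-cell crop yields one `(patient_id, predicted_label)` pair and that the CBLC ratio is the number of cells predicted as a CBLC member category divided by all classified cells for that patient. The abstract does not state the denominator, so that choice, the function name `patient_cblc_ratios`, the patient IDs, and the non-CBLC labels `Lym` and `Ery` are all hypothetical; only the eight CBLC labels come from the text.

```python
from collections import defaultdict

# CBLC member categories as listed in the abstract.
CBLC_LABELS = frozenset({"N", "N1", "M", "M1", "R", "R1", "J", "J1"})


def patient_cblc_ratios(cell_predictions):
    """Return {patient_id: CBLC ratio} from per-cell predicted labels.

    `cell_predictions` is an iterable of (patient_id, predicted_label) pairs.
    Assumption: the ratio uses all classified cells of a patient as the
    denominator; the actual study may define it differently.
    """
    totals = defaultdict(int)  # classified cells per patient
    cblc = defaultdict(int)    # cells predicted as a CBLC category per patient
    for patient_id, label in cell_predictions:
        totals[patient_id] += 1
        if label in CBLC_LABELS:
            cblc[patient_id] += 1
    # Every patient in `totals` has at least one cell, so no division by zero.
    return {pid: cblc[pid] / totals[pid] for pid in totals}


if __name__ == "__main__":
    # Hypothetical predictions; "Lym" and "Ery" stand in for non-CBLC classes.
    predictions = [
        ("P001", "N"), ("P001", "M1"), ("P001", "Lym"), ("P001", "Ery"),
        ("P002", "Ery"), ("P002", "Ery"), ("P002", "Ery"), ("P002", "J"),
    ]
    print(patient_cblc_ratios(predictions))
```

Any decision threshold applied to these ratios is left out on purpose, since the abstract reports none.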