Acute lymphoblastic leukemia (ALL) is one of the most common types of childhood blood cancer. Because starting treatment quickly is critical to saving the patient's life, early diagnosis is essential. Examining blood smear images is one of the methods expert doctors use to diagnose the disease. Deep learning methods have advanced significantly in recent years and now have numerous applications in medicine; ALL diagnosis is no exception, and several machine learning-based methods have been proposed for it. Previous methods reported high diagnostic accuracy, but our work shows that accuracy alone is not sufficient, because a model can achieve it by taking shortcuts rather than making meaningful decisions. This issue arises from the small size of medical training datasets. To address it, we constrained our model to follow a pipeline inspired by the way experts work. We also show that, because a judgement based on a single image is insufficient, the problem must be redefined as a multiple-instance learning problem to achieve a practical result. Our model is the first to provide a solution to this problem in a multiple-instance learning setup. We introduce a novel pipeline for diagnosing ALL that approximates the process used by hematologists, is sensitive to disease biomarkers, and achieves an accuracy of 96.15%, an F1-score of 94.24%, a sensitivity of 97.56%, and a specificity of 90.91% on ALL IDB 1. We further evaluated our method on an out-of-distribution dataset, which posed a challenging test, and its performance was acceptable. Notably, our model was trained on a relatively small dataset, which highlights the potential of our approach for other medical datasets with limited data availability.
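As a reading aid for the four reported figures, the sketch below computes accuracy, sensitivity, specificity, and F1 from confusion-matrix counts. The counts used (40 true positives, 1 false negative, 10 true negatives, 1 false positive) are not stated in the abstract; they are one set of counts that happens to reproduce the reported percentages, and only if the F1-score is taken as the macro average over the two classes. Both the counts and the macro-averaging reading are assumptions, and the names `Confusion`, `metrics`, and `ASSUMED_COUNTS` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts, with ALL as the positive class."""
    tp: int  # ALL cases predicted as ALL
    fn: int  # ALL cases predicted as healthy
    tn: int  # healthy cases predicted as healthy
    fp: int  # healthy cases predicted as ALL


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 for one class: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


def metrics(c: Confusion) -> dict:
    """Return the four metrics as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    f1_positive = f1(c.tp, c.fp, c.fn)
    # For the negative class the roles of the counts are swapped.
    f1_negative = f1(c.tn, c.fn, c.fp)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "macro_f1": (f1_positive + f1_negative) / 2,
    }


# ASSUMPTION: these counts are inferred, not taken from the source.
ASSUMED_COUNTS = Confusion(tp=40, fn=1, tn=10, fp=1)

if __name__ == "__main__":
    for name, value in metrics(ASSUMED_COUNTS).items():
        print(f"{name}: {100 * value:.2f}%")
```

Under these assumed counts, the positive-class F1 alone would be about 97.56%, so the gap between that and the reported 94.24% is what suggests macro averaging. The specificity also rests on only 11 negative cases, which is worth keeping in mind when comparing it with results on larger test sets.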