Data-driven deep learning models have shown great capability in assisting radiologists with breast ultrasound (US) diagnosis. However, their effectiveness is limited by the long-tailed distribution of training data, which leads to inaccurate predictions on rare cases. In this study, we address the long-standing challenge of improving diagnostic model performance on rare cases when training on long-tailed data. Specifically, we introduce a pipeline, TAILOR, that builds a knowledge-driven generative model to produce tailored synthetic data. Using 3,749 lesions as source data, the generative model can generate millions of breast-US images, especially for error-prone rare cases. The generated data can then be used to build a diagnostic model for accurate and interpretable diagnosis. In the prospective external evaluation, our diagnostic model outperforms the average performance of nine radiologists by 33.5% in specificity at the same sensitivity, and it improves the radiologists' performance by providing predictions with an interpretable decision-making process. Moreover, on ductal carcinoma in situ (DCIS), our diagnostic model outperforms all radiologists by a large margin, even though the source data contain only 34 DCIS lesions. We believe that TAILOR can potentially be extended to various diseases and imaging modalities.
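The comparison of specificity at matched sensitivity can be illustrated with a minimal sketch. The abstract does not state how the operating point was chosen, so the procedure below is an assumption: it picks the highest score threshold whose sensitivity reaches a target value (for example, a reader's sensitivity) and reports the specificity at that threshold. The function name `specificity_at_sensitivity` and the toy labels and scores are hypothetical and are not taken from the study.

```python
import numpy as np


def specificity_at_sensitivity(y_true, scores, target_sensitivity):
    """Hypothetical helper: find the highest threshold whose sensitivity is at
    least `target_sensitivity`; return (threshold, sensitivity, specificity).

    y_true: 1 for malignant, 0 for benign.
    scores: model scores, higher meaning more likely malignant.
    """
    if not 0.0 < target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must be in (0, 1]")

    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malignant and one benign case")

    # Try each distinct score as a threshold, from highest to lowest.
    # Sensitivity can only grow as the threshold drops, so the first hit
    # is the highest threshold that meets the target.
    for thr in np.sort(np.unique(scores))[::-1]:
        pred = scores >= thr
        sensitivity = (pred & y_true).sum() / n_pos
        if sensitivity >= target_sensitivity:
            specificity = (~pred & ~y_true).sum() / n_neg
            return float(thr), float(sensitivity), float(specificity)

    # Not reachable for valid input: the lowest threshold flags every case,
    # which gives a sensitivity of 1.0.
    raise RuntimeError("no threshold reached the target sensitivity")


# Toy data (made up for illustration): 3 malignant and 4 benign lesions.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

thr, sens, spec = specificity_at_sensitivity(toy_labels, toy_scores, 1.0)
print(f"threshold={thr}, sensitivity={sens}, specificity={spec}")
```

Fixing sensitivity first makes the specificity figures of the model and of the readers directly comparable, because both are then evaluated while detecting the same fraction of malignant lesions.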