Integrating deep learning with clinical expertise holds great potential for addressing healthcare challenges and for giving medical professionals improved diagnostic tools. However, the need for annotated medical images is often an obstacle to leveraging the full power of machine learning models. Our research demonstrates that combining synthetic images, generated using diffusion models, with real images enhances nonalcoholic fatty liver disease (NAFLD) classification performance, even in low-data regimes. We evaluate the quality of the synthetic images with two metrics, Inception Score (IS) and Fr\'{e}chet Inception Distance (FID), computed on both diffusion-generated and generative adversarial network (GAN)-generated images. Diffusion-generated images score better on both metrics, with a maximum IS of $1.90$ compared to $1.67$ for GANs, and a minimum FID of $69.45$ compared to $100.05$ for GANs. Using a partially frozen CNN backbone (EfficientNet v1), our synthetic augmentation method achieves a maximum image-level ROC AUC of $0.904$ on a NAFLD prediction task.
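For reference, the two image-quality metrics named above are usually defined as follows. This is a sketch of the standard formulation, not necessarily the exact configuration used in this work: the symbols $p_g$, $p(y \mid x)$, $p(y)$, $\mu_r$, $\Sigma_r$, $\mu_g$, and $\Sigma_g$ are notation introduced here for illustration, and the feature extractor (conventionally a pretrained Inception network) is an assumption.

\begin{equation}
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g} \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\|\, p(y) \right) \right] \right)
\end{equation}

\begin{equation}
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
\end{equation}

Here $p_g$ denotes the distribution of generated images, $p(y \mid x)$ the class posterior predicted for an image $x$, and $p(y)$ its marginal over generated images; $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the mean and covariance of the network features for real and generated images, respectively. A higher IS and a lower FID both indicate better samples, which is why the maximum IS and the minimum FID are the values reported.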