In medical imaging, the scarcity of large-scale datasets caused by privacy restrictions is a significant barrier to developing large models for medical applications. To address this issue, we introduce SynFundus-1M, a high-quality synthetic dataset of over 1 million retinal fundus images with extensive disease and pathology annotations, generated by a Denoising Diffusion Probabilistic Model. The SynFundus-Generator and SynFundus-1M achieve superior Fréchet Inception Distance (FID) scores compared to existing methods on mainstream public real datasets. Furthermore, an evaluation by ophthalmologists confirms that these synthetic images are difficult to distinguish from real ones, supporting the authenticity of SynFundus-1M. Through extensive experiments, we demonstrate that both CNNs and ViTs can benefit from SynFundus-1M, either through pretraining or through direct training. Compared to datasets such as ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.
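For reference, the Fréchet Inception Distance mentioned above is conventionally defined as follows. This is the standard definition rather than a detail taken from this work; the symbols are assumptions introduced here for illustration, with $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denoting the mean and covariance of Inception features computed on real and generated images, respectively. The abstract does not state which feature extractor or sample sizes were used.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A lower value indicates that the two feature distributions are closer, so a "superior" FID score here means a smaller one.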