In medical imaging, large-scale public datasets with high-quality annotations are rare because of data privacy constraints and annotation cost. To address this issue, we release SynFundus-1M, a high-quality synthetic dataset containing over \textbf{1 million} fundus images covering 11 disease types. Moreover, we intentionally diversify the readability of the images and accordingly provide four types of quality scores for each image. To the best of our knowledge, SynFundus-1M is currently the largest fundus dataset with the most sophisticated annotations. All the images are generated by a Denoising Diffusion Probabilistic Model named SynFundus-Generator. Trained on over 1.3 million private fundus images, SynFundus-Generator achieves significantly superior performance in generating fundus images compared to some recent related works. Furthermore, when we blend synthetic images from SynFundus-1M with real fundus images, ophthalmologists can hardly distinguish the synthetic images from the real ones. Through extensive experiments, we demonstrate that both convolutional neural networks (CNNs) and Vision Transformers (ViTs) can benefit from SynFundus-1M, either by pretraining on it or by training on it directly. Compared to models trained on datasets such as ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.