Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy

September 11, 2024

Somayeh Pakdelmoez, Saba Omidikia, Seyyed Ali Seyyedsalehi, Seyyede Zohreh Seyyedsalehi

From arXiv, 30 pages, 17 figures

Diabetic retinopathy (DR) is a consequence of diabetes mellitus characterized by vascular damage within the retinal tissue. Timely detection is paramount to mitigate the risk of vision loss. However, training robust grading models is hindered by a shortage of annotated data, particularly for severe cases. This paper proposes a framework for controllably generating high-fidelity and diverse DR fundus images, thereby improving classifier performance in DR grading and detection. We achieve comprehensive control over DR severity and visual features (optic disc, vessel structure, lesion areas) within generated images solely through a conditional StyleGAN, eliminating the need for feature masks or auxiliary networks. Specifically, leveraging the SeFa algorithm to identify meaningful semantics within the latent space, we manipulate the DR images generated conditionally on grades, further enhancing the dataset diversity. Additionally, we propose a novel, effective SeFa-based data augmentation strategy, helping the classifier focus on discriminative regions while ignoring redundant features. Using this approach, a ResNet50 model trained for DR detection achieves 98.09% accuracy, 99.44% specificity, 99.45% precision, and an F1-score of 98.09%. Moreover, incorporating synthetic images generated by conditional StyleGAN into ResNet50 training for DR grading yields 83.33% accuracy, a quadratic kappa score of 87.64%, 95.67% specificity, and 72.24% precision. Extensive experiments conducted on the APTOS 2019 dataset demonstrate the exceptional realism of the generated images and the superior performance of our classifier compared to recent studies.
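
To make the SeFa step more concrete, here is a minimal sketch of the closed-form idea behind it: the candidate semantic directions are the unit eigenvectors of A^T A with the largest eigenvalues, where A is the weight matrix of a layer that transforms the latent code, and an image is edited by moving its latent code along one of those directions. Everything specific below is an assumption for illustration only: the toy matrix `A`, the helper names `sefa_directions` and `edit_latent`, and the step size `alpha` are hypothetical, and the abstract does not say which generator layers the authors factorize. Power iteration with deflation is used here in place of a library eigendecomposition so that the sketch needs only the standard library.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gram(A):
    """Return A^T A for a weight matrix A given as a list of rows."""
    cols = list(zip(*A))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sefa_directions(A, k, iters=500, seed=0):
    """Approximate the top-k unit eigenvectors of A^T A.

    Uses power iteration, projecting out directions already found
    (deflation). Each direction is only defined up to sign.
    """
    rng = random.Random(seed)
    G = gram(A)
    dim = len(G)
    directions = []
    for _ in range(k):
        v = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(iters):
            w = matvec(G, v)
            for u in directions:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            v = normalize(w)
        directions.append(v)
    return directions


def edit_latent(z, direction, alpha):
    """Move latent code z by alpha along a semantic direction."""
    return [z_i + alpha * n_i for z_i, n_i in zip(z, direction)]


# Hypothetical toy weight matrix (4 outputs, 3-dimensional latent code).
# In a real generator this would be read from the trained network.
A = [
    [3.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]

directions = sefa_directions(A, k=2)
z = [0.5, -0.2, 0.1]
z_edited = edit_latent(z, directions[0], alpha=2.0)
```

In a setup like the one the abstract describes, the edited code would be passed back through the grade-conditioned generator, so the severity label stays fixed while the selected visual attribute changes. Which attribute a given direction controls has to be checked by inspecting the generated images.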
