The limited availability of training data is one of the main constraints on deep learning applications in medical imaging. Data augmentation is a popular way to address this problem. A more recent approach is machine-learning-based augmentation, in particular the use of Generative Adversarial Networks (GANs). Here, a GAN generates images similar to those in the original dataset, increasing the overall amount of training data, which can lead to better performance of the trained networks. A GAN consists of two networks, a generator and a discriminator, connected in a feedback loop that creates a competitive environment. This work continues our previous research, in which we trained Nvidia's StyleGAN2-ADA on a limited COVID-19 chest X-ray image dataset. In this paper, we study how the performance of GAN-based augmentation depends on dataset size, with a focus on small samples. Two datasets are considered: one with 1000 images per class (4000 images in total) and one with 500 images per class (2000 images in total). We train StyleGAN2-ADA on both sets and, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in a multi-class classification problem. We compare GAN-based augmentation with two alternatives (classical augmentation and no augmentation at all) using transfer-learning-based classification of COVID-19 chest X-ray images. The results are quantified using several classification quality metrics and compared with results from the literature. GAN-based augmentation is found to be comparable to classical augmentation for medium and large datasets but underperforms for smaller datasets. A correlation between the size of the original dataset and the quality of classification is visible regardless of the augmentation approach.
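As an illustration of the two dataset sizes described in the abstract, the following is a minimal sketch of how class-balanced subsets of 1000 and 500 images per class could be drawn from a larger labelled pool. It is not the authors' code. The function name `balanced_subset`, the placeholder class names, the file paths, and the pool size of 1200 images per class are all hypothetical. The count of four classes is inferred from the stated totals (4000 / 1000 and 2000 / 500).

```python
import random
from collections import defaultdict


def balanced_subset(samples, per_class, seed=0):
    """Return `per_class` randomly chosen (path, label) pairs for every label.

    `samples` is a list of (path, label) tuples. Hypothetical helper,
    not taken from the paper.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        paths = by_label[label]
        if len(paths) < per_class:
            raise ValueError(f"class {label!r} has only {len(paths)} images")
        # Sample without replacement so no image is duplicated.
        for path in rng.sample(paths, per_class):
            subset.append((path, label))
    return subset


# Placeholder class names and paths (assumptions; the abstract does not name them).
CLASSES = ["class_a", "class_b", "class_c", "class_d"]
pool = [(f"{c}/img_{i:04d}.png", c) for c in CLASSES for i in range(1200)]

medium = balanced_subset(pool, per_class=1000)  # 4 x 1000 images
small = balanced_subset(pool, per_class=500)    # 4 x 500 images
print(len(medium), len(small))  # 4000 2000
```

Fixing the random seed keeps the subsets identical across runs, which matters when the same subset is reused for GAN training and for the classification comparison.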