A Hybrid Approach for COVID-19 Detection: Combining Wasserstein GAN with Transfer Learning

COVID-19 is highly contagious, and its rapid spread has drawn attention to early diagnosis. Early diagnosis enables healthcare professionals and government authorities to break the chain of transmission and flatten the epidemic curve. With the number of cases accelerating across the developed world, COVID-19-induced viral pneumonia poses a major challenge. The overlap of COVID-19 cases with viral pneumonia and other lung infections, combined with limited datasets and long training times, is a serious problem to address. Limited data often leads to over-fitted models that do not generalize well. To fill this gap, we propose a GAN-based approach that synthesizes images, which are then fed into deep learning models to classify images as COVID-19, Normal, or Viral Pneumonia. Specifically, a customized Wasserstein GAN is proposed to generate 19% more chest X-ray images than the real images. The expanded dataset is then used to train four deep learning models: VGG-16, ResNet-50, GoogLeNet and MNAST. The results show that the models trained on the expanded dataset deliver high classification accuracy. In particular, VGG-16 achieved the highest accuracy, 99.17%, among the four schemes. ResNet-50, GoogLeNet and MNAST delivered testing accuracies of 93.9%, 94.49% and 97.75%, respectively. The performance of these models is then compared with state-of-the-art models on the basis of accuracy. Furthermore, the proposed approach can be applied to address scarce datasets in other image analysis problems.
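For readers unfamiliar with the Wasserstein GAN named above, the following is a minimal sketch of the standard WGAN objective with weight clipping. It is not the customized variant proposed here, whose details the abstract does not give. The function names, the example scores, and the clipping threshold of 0.01 are illustrative assumptions.

```python
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Standard WGAN critic loss, minimised by the critic.

    Equals mean(D(fake)) - mean(D(real)); its negative estimates the
    Wasserstein distance between the real and generated distributions.
    """
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Standard WGAN generator loss: the generator raises the critic's score on fakes."""
    return -fmean(fake_scores)


def clip_weights(weights, c=0.01):
    """Clamp critic weights to [-c, c], the original WGAN's Lipschitz heuristic.

    The threshold c=0.01 is an assumed value for illustration.
    """
    return [max(-c, min(c, w)) for w in weights]


# Hypothetical critic outputs for a tiny batch (made-up numbers).
real = [1.0, 3.0]
fake = [0.0, 1.0]
print(critic_loss(real, fake))
print(generator_loss(fake))
print(clip_weights([0.5, -0.2, 0.005]))
```

In a full training loop these losses would be computed on critic outputs for batches of real and generated chest X-ray images, with the critic typically updated several times per generator step.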
