Class imbalance is pervasive in disease classification from medical images. Models generally perform well only when the class distribution is reasonably balanced during training. For rare diseases, however, images from affected patients are much harder to obtain than images from unaffected patients, which leads to imbalanced datasets. Various approaches to class imbalance have been explored, each with its own drawbacks. In this work, we propose an outlier-detection-based binary medical image classification technique that can handle even the most extreme class imbalance. We use a dataset of malaria-parasitized and uninfected cell images. An autoencoder model named AnoMalNet is first trained on uninfected cell images only and is then used to classify both parasitized and uninfected cell images by thresholding a loss value. We achieve an accuracy, precision, recall, and F1 score of 98.49%, 97.07%, 100%, and 98.52%, respectively, outperforming large deep learning models and other published works. Because the proposed approach gives competitive results without requiring disease-positive samples during training, it should prove useful for binary disease classification on imbalanced datasets.
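The classification-by-thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss function or how the threshold is chosen, so the per-image mean squared error, the `mean + k * std` rule, and the helper names `per_image_mse`, `fit_threshold`, and `classify` are all assumptions made for illustration. The reconstructions are assumed to come from an autoencoder trained only on uninfected cell images.

```python
import numpy as np


def per_image_mse(images, reconstructions):
    """Mean squared reconstruction error, one value per image.

    Assumption: MSE is used as the loss; the abstract only says "a loss value".
    """
    images = np.asarray(images, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    diff = images - reconstructions
    # Flatten every image to a vector and average the squared error over pixels.
    return (diff ** 2).reshape(len(images), -1).mean(axis=1)


def fit_threshold(normal_losses, k=3.0):
    """Pick a threshold from losses on held-out uninfected images.

    Assumption: mean + k * std is a hypothetical rule, not the paper's.
    """
    normal_losses = np.asarray(normal_losses, dtype=np.float64)
    return float(normal_losses.mean() + k * normal_losses.std())


def classify(losses, threshold):
    """Return True (flagged as parasitized) where the loss exceeds the threshold."""
    return np.asarray(losses, dtype=np.float64) > threshold
```

The idea behind this design is that an autoencoder that has only seen uninfected cells should reconstruct them with low error, while parasitized cells should produce a higher error and fall above the threshold. In practice the threshold would be tuned on a validation set rather than fixed by a rule of thumb.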