Class imbalance is a pervasive issue in disease classification from medical images. Models generally need a balanced class distribution during training to produce good results. For rare diseases, however, images from affected patients are much harder to obtain than images from non-affected patients, which results in class imbalance. Various approaches to tackling class imbalance have been explored so far, each with its own drawbacks. In this research, we propose an outlier-detection-based binary medical image classification technique that can handle even the most extreme case of class imbalance. We use a dataset of malaria-parasitized and uninfected cells. An autoencoder model named AnoMalNet is first trained on uninfected cell images only and is then used to classify both parasitized and uninfected cell images by thresholding a loss value. We achieve an accuracy, precision, recall, and F1 score of 98.49%, 97.07%, 100%, and 98.52%, respectively, outperforming large deep learning models and other published works. Because the proposed approach provides competitive results without requiring disease-positive samples during training, it should prove useful for binary disease classification on imbalanced datasets.
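The train-on-normal-only, threshold-the-loss idea described above can be illustrated with a minimal sketch. This is not the AnoMalNet architecture: as a stand-in it uses a linear autoencoder fitted by PCA, synthetic vectors in place of cell images, per-sample mean squared reconstruction error as the loss, and a threshold at the 99th percentile of the training error. All function names, the data, and the threshold rule are assumptions made for illustration only.

```python
import numpy as np


def fit_linear_autoencoder(normal, n_components):
    """Fit a linear encoder/decoder (PCA) using normal-class samples only."""
    mean = normal.mean(axis=0)
    # Rows of vt are the principal directions of the centered normal data.
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(x, mean, components):
    """Per-sample mean squared error between input and reconstruction."""
    centered = x - mean
    # Encode (project onto components), then decode (map back to input space).
    recon = centered @ components.T @ components
    return np.mean((centered - recon) ** 2, axis=1)


def classify(x, mean, components, threshold):
    """Return 1 (flagged as anomalous) where the error exceeds the threshold."""
    return (reconstruction_error(x, mean, components) > threshold).astype(int)


rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): "normal" samples lie near a
# 3-dimensional subspace of a 64-dimensional space; "anomalous" ones do not.
basis = rng.normal(size=(3, 64))
normal_train = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 64))
normal_test = rng.normal(size=(100, 3)) @ basis + 0.05 * rng.normal(size=(100, 64))
anomalous_test = rng.normal(size=(100, 64))

# Training sees normal samples only; no anomalous sample is used here.
mean, components = fit_linear_autoencoder(normal_train, n_components=3)

# Hypothetical threshold rule: 99th percentile of the training error.
train_err = reconstruction_error(normal_train, mean, components)
threshold = np.percentile(train_err, 99)

pred_normal = classify(normal_test, mean, components, threshold)
pred_anomalous = classify(anomalous_test, mean, components, threshold)
print("flagged among normal test samples:", pred_normal.mean())
print("flagged among anomalous test samples:", pred_anomalous.mean())
```

Because the model only learns to reconstruct the normal class, samples from the unseen class tend to reconstruct poorly and fall above the threshold. In practice the threshold choice trades precision against recall, and a percentile of the training loss is only one possible rule.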