Skin cancer is a major public health concern, accounting for one-third of reported cancers. If not detected early, it can have severe consequences. Effective skin cancer classification is therefore critical, yet existing models are often too large to deploy in areas with limited computational resources. To address this limitation, we present a knowledge-distillation-based approach for building a lightweight yet high-performing classifier. The proposed solution fuses three models, namely ResNet152V2, ConvNeXtBase, and ViT Base, into an effective teacher model. The teacher then guides a lightweight student model of size 2.03 MB, which is further compressed to 469.77 KB using 16-bit quantization so that it can be deployed on edge devices. With six-stage image preprocessing, data augmentation, and a rigorous ablation study, the model achieves an accuracy of 98.75% on the HAM10000 dataset and 98.94% on the Kaggle dataset in classifying benign and malignant skin cancers. Its high accuracy and compact size make the model a promising candidate for skin cancer classification, particularly in resource-constrained settings.
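To make the teacher-student idea concrete, the following is a minimal sketch of a distillation loss for a single sample in plain Python. It is illustrative only and not the authors' implementation: the abstract does not say how the three teacher networks are fused, so averaging their logits in `fuse_teachers` is an assumption, and the `temperature` and `alpha` values are hypothetical defaults rather than reported hyperparameters.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_teachers(teacher_logits):
    """Average per-class logits across teachers (assumed fusion rule)."""
    count = len(teacher_logits)
    return [sum(per_class) / count for per_class in zip(*teacher_logits)]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are hypothetical values chosen for illustration.
    """
    # Soft-target term: KL(teacher || student) on temperature-softened
    # distributions, scaled by T^2 as is common in distillation.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label term: cross-entropy against the ground-truth class index.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Two-class example (index 0 = benign, 1 = malignant) with made-up logits.
teachers = [[2.0, 0.0], [4.0, -2.0], [3.0, -1.0]]
fused = fuse_teachers(teachers)
loss = distillation_loss([0.5, 0.2], fused, label=0)
```

The soft-target term pulls the student toward the fused teacher's class distribution, while the cross-entropy term keeps it anchored to the labels; `alpha` trades off the two.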