Skin cancer is one of the most prevalent forms of human cancer. It is diagnosed primarily visually, beginning with clinical screening and continuing with dermoscopic examination, histological assessment, and specimen collection. Deep convolutional neural networks (CNNs) show potential for general and highly variable tasks across many fine-grained object categories. This research proposes a novel multi-class prediction framework that classifies skin lesions using a Vision Transformer (ViT) and ViTGAN. Vision-transformer-based GANs (generative adversarial networks) are used to address class imbalance. The framework consists of four main phases: ViTGAN-based image synthesis, image processing, ViT-based classification, and explainable AI. Phase 1 generates synthetic images to balance all classes in the dataset. Phase 2 applies data augmentation techniques and morphological operations to increase the size of the data. Phases 3 and 4 involve developing a ViT model for edge computing systems that can identify patterns and categorize skin lesions visible in an image of the user's skin. After the ViT classifies the lesions into the desired class in Phase 3, explainable AI (XAI) is applied to make the results more interpretable (using activation maps, etc.) while maintaining high predictive accuracy. Real-time images of skin diseases can be captured by a doctor or a patient using the camera of a mobile application to perform an early examination and determine the cause of the skin lesion. The whole framework is compared with existing frameworks for skin lesion detection.
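As a rough illustration of the balancing goal in Phase 1, the sketch below computes how many synthetic images each class would need so that every class matches the largest one. It is only a minimal sketch under stated assumptions: the function name `synthetic_quota`, the class names, and the counts are hypothetical and do not come from the framework itself, and the image generation step (ViTGAN) is not shown.

```python
from collections import Counter
from typing import Dict, Iterable


def synthetic_quota(labels: Iterable[str]) -> Dict[str, int]:
    """Return how many synthetic images each class needs to match the largest class.

    Hypothetical helper for illustration; not part of the proposed framework.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())  # size of the majority class
    return {cls: target - n for cls, n in counts.items()}


# Made-up label list: class names and counts are assumptions for the example.
labels = ["nevus"] * 6 + ["melanoma"] * 2 + ["keratosis"] * 3
quota = synthetic_quota(labels)
print(quota)  # {'nevus': 0, 'melanoma': 4, 'keratosis': 3}
```

Matching the majority class is one simple balancing target; a real pipeline might instead cap the number of synthetic images per class to limit the share of generated data.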