Towards Explainable Skin Cancer Classification: A Dual-Network Attention Model with Lesion Segmentation and Clinical Metadata Fusion

Skin cancer is a life-threatening disease for which early detection significantly improves patient outcomes. Automated diagnosis from dermoscopic images is challenging because of high intra-class variability and subtle inter-class differences. Many deep learning models also operate as "black boxes," which limits clinical trust. In this work, we propose a dual-encoder, attention-based framework that uses both segmented lesions and clinical metadata to improve the accuracy and interpretability of skin lesion classification. A novel Deep-UNet architecture with Dual Attention Gates (DAG) and Atrous Spatial Pyramid Pooling (ASPP) first segments the lesion. The classification stage then uses two DenseNet201 encoders, one applied to the original image and the other to the segmented lesion, whose features are fused via multi-head cross-attention. This dual-input design guides the model to focus on salient pathological regions. In addition, a transformer-based module incorporates patient metadata (age, sex, lesion site) into the prediction. We evaluate our approach on the HAM10000 dataset and on the ISIC 2018 and 2019 challenge datasets. The proposed method achieves state-of-the-art segmentation performance and significantly improves classification accuracy and average AUC compared to baseline models. To validate the model's reliability, we use Gradient-weighted Class Activation Mapping (Grad-CAM) to generate heatmaps. These visualizations confirm that our model's predictions are based on the lesion area, unlike models that rely on spurious background features. Together, these results demonstrate that integrating precise lesion segmentation and clinical data through attention-based fusion yields a more accurate and interpretable skin cancer classification model.
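The fusion step described above can be illustrated with a minimal sketch of multi-head cross-attention. This is not the authors' implementation: it assumes that queries come from the original-image branch and that keys and values come from the segmented-lesion branch, it omits the learned projection matrices a real layer would have, and it adds a residual connection for illustration. The function names (`cross_attention_fuse`, `attention_head`, `split_heads`) and the toy token vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(queries, keys, values):
    """Scaled dot-product attention for a single head."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(head_dim).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def split_heads(tokens, num_heads):
    """Slice each token's feature vector into num_heads equal chunks."""
    head_dim = len(tokens[0]) // num_heads
    return [
        [tok[h * head_dim:(h + 1) * head_dim] for tok in tokens]
        for h in range(num_heads)
    ]


def cross_attention_fuse(image_tokens, lesion_tokens, num_heads=2):
    """Fuse two token sets: image tokens attend to lesion tokens.

    Assumption: the query/key-value roles and the residual connection are
    illustrative choices; learned projections are omitted for brevity.
    """
    if len(image_tokens[0]) % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    query_heads = split_heads(image_tokens, num_heads)
    key_value_heads = split_heads(lesion_tokens, num_heads)
    per_head = [
        attention_head(q, kv, kv) for q, kv in zip(query_heads, key_value_heads)
    ]
    fused = []
    for i, token in enumerate(image_tokens):
        # Concatenate the head outputs, then add the original image token.
        attended = [x for head in per_head for x in head[i]]
        fused.append([a + b for a, b in zip(token, attended)])
    return fused


if __name__ == "__main__":
    image_tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.3, 0.3]]
    lesion_tokens = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.2, 0.8]]
    for row in cross_attention_fuse(image_tokens, lesion_tokens):
        print([round(x, 3) for x in row])
```

Each fused token has the same dimensionality as the input tokens, so the output could be pooled and passed to a classifier head. In a deep learning framework the same role would normally be played by a trained attention layer applied to encoder feature maps rather than hand-written loops.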

