Transformer-based interpretable multi-modal data fusion for skin lesion classification

Much current deep learning (DL) research focuses on improving quantitative metrics regardless of other factors. In human-centered applications such as skin lesion classification in dermatology, DL-driven clinical decision support systems are still in their infancy because their decision-making process offers limited transparency. Moreover, the lack of procedures that can explain the behavior of trained DL algorithms means that clinical physicians place almost no trust in them. To diagnose skin lesions, dermatologists rely on visual assessment of the disease and on data gathered from the patient's anamnesis. Data-driven algorithms that deal with multi-modal data are limited by the separation of feature-level and decision-level fusion procedures required by convolutional architectures. To address this issue, we enable single-stage multi-modal data fusion via the attention mechanism of transformer-based architectures to aid in diagnosing skin diseases. Our method outperforms other state-of-the-art single- and multi-modal DL architectures in image-rich and patient-data-rich environments. Additionally, the choice of architecture provides native interpretability for the classification task in both the image and metadata domains, with no additional modifications necessary.
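To make the idea of single-stage fusion through attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the token counts, the embedding size `DIM`, the random weights, and the helper names (`self_attention`, `fuse`) are all assumptions chosen for illustration. Image patch embeddings and patient-metadata embeddings are placed in one token sequence behind a classification token, a single self-attention step mixes them, and the classification token's attention row is read out as a rough per-token relevance score in both domains.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    Returns (outputs, attention), where attention[i][j] is the weight
    token i places on token j.
    """
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    dim = len(queries[0])
    outputs, attention = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        attention.append(weights)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs, attention


def fuse(cls_token, image_tokens, metadata_tokens, w_q, w_k, w_v):
    """Fuse both modalities in one step by attending over a joint sequence."""
    tokens = [cls_token] + image_tokens + metadata_tokens
    outputs, attention = self_attention(tokens, w_q, w_k, w_v)
    cls_row = attention[0]  # how the classification token weighs every token
    n_img = len(image_tokens)
    return {
        "fused": outputs[0],
        "cls_self_weight": cls_row[0],
        "image_relevance": cls_row[1:1 + n_img],
        "metadata_relevance": cls_row[1 + n_img:],
    }


# Assumed toy setup: 4 image patch tokens, 3 metadata tokens, 8-dim embeddings.
rng = random.Random(0)
DIM = 8
cls_token = [rng.gauss(0, 1) for _ in range(DIM)]
image_tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]
metadata_tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]
w_q, w_k, w_v = (random_matrix(DIM, DIM, rng) for _ in range(3))

result = fuse(cls_token, image_tokens, metadata_tokens, w_q, w_k, w_v)
print("image relevance:   ", [round(w, 3) for w in result["image_relevance"]])
print("metadata relevance:", [round(w, 3) for w in result["metadata_relevance"]])
```

Because both modalities share one attention step, the same weights that perform the fusion can also be inspected afterwards, which is the intuition behind the native interpretability claim. A real model would use learned embeddings, multiple heads, and several layers.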
