From Majority to Minority: A Diffusion-based Augmentation for Underrepresented Groups in Skin Lesion Analysis

AI-based diagnoses have demonstrated dermatologist-level performance in classifying skin cancer. However, such systems are prone to under-performing when tested on data from minority groups that lack sufficient representation in the training sets. Although data collection and annotation offer the best means for promoting minority groups, these processes are costly and time-consuming. Prior works have suggested that data from majority groups may serve as a valuable information source to supplement the training of diagnosis tools for minority groups. In this work, we propose an effective diffusion-based augmentation framework that maximizes the use of rich information from majority groups to benefit minority groups. Using groups with different skin types as a case study, our results show that the proposed framework can generate synthetic images that improve diagnostic results for the minority groups, even when there is little or no reference data from these target groups. The practical value of our work is evident in medical imaging analysis, where under-diagnosis persists as a problem for certain groups due to insufficient representation.
