MedAugment: Universal Automatic Data Augmentation Plug-in for Medical Image Analysis

Data augmentation (DA) is widely used in computer vision to alleviate data shortage, but DA in medical image analysis (MIA) faces multiple challenges. The prevalent DA approaches in MIA are conventional DA, synthetic DA, and automatic DA, and each has drawbacks such as experience-driven design and intensive computational cost. Here, we propose an efficient and effective automatic DA method termed MedAugment. We define a pixel augmentation space and a spatial augmentation space, and exclude operations that can break medical details and features, such as severe color distortions or structural alterations that compromise the diagnostic value of an image. We also propose a novel sampling strategy that draws a limited number of operations from the two spaces. Moreover, we present a hyperparameter mapping relationship that produces a rational augmentation level and makes MedAugment fully controllable through a single hyperparameter. These configurations account for the differences between natural and medical images, in particular the high sensitivity of medical images to attributes such as brightness and posterization. Extensive experiments on four classification and four segmentation datasets demonstrate the superiority of MedAugment. Compared with existing approaches, MedAugment serves as a more suitable yet general processing pipeline for medical images: it avoids color distortions and structural alterations and adds negligible computational overhead. Our method can serve as a plug-in for arbitrary projects without any extra training stage, and thus has the potential to make a valuable contribution to the medical field, particularly for medical experts without a solid foundation in deep learning. Code is available at https://github.com/NUS-Tim/MedAugment.
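To make the idea of two augmentation spaces, a bounded number of sampled operations, and a single controlling hyperparameter more concrete, the following is a minimal sketch under stated assumptions. It is not the released MedAugment implementation: the operation names, the maximum magnitudes, the linear level-to-magnitude mapping, and the rule of "at most one pixel operation per sample" are all hypothetical choices made for illustration, and the function `med_augment_sketch` does not exist in the official repository.

```python
import random
import numpy as np

# All operations, ranges, and the mapping below are illustrative assumptions,
# not the settings used by the MedAugment authors.

def brightness(img, m, rng):
    # Scale intensities by a factor drawn from [1 - m, 1 + m].
    return np.clip(img * (1.0 + rng.uniform(-m, m)), 0.0, 1.0)

def contrast(img, m, rng):
    # Stretch or compress intensities around the image mean.
    mean = img.mean()
    return np.clip((img - mean) * (1.0 + rng.uniform(-m, m)) + mean, 0.0, 1.0)

def hflip(img, m, rng):
    # Horizontal flip; the magnitude is unused.
    return img[:, ::-1]

def translate_x(img, m, rng):
    # Shift horizontally by up to m * width pixels, filling with zeros.
    shift = int(round(rng.uniform(-m, m) * img.shape[1]))
    out = np.zeros_like(img)
    if shift > 0:
        out[:, shift:] = img[:, :-shift]
    elif shift < 0:
        out[:, :shift] = img[:, -shift:]
    else:
        out[:] = img
    return out

# Each space maps an operation to its (assumed) maximum magnitude.
PIXEL_SPACE = [(brightness, 0.2), (contrast, 0.2)]
SPATIAL_SPACE = [(hflip, 0.0), (translate_x, 0.1)]

def magnitude(level, max_magnitude, max_level=10):
    # Assumed linear mapping from the single level hyperparameter to a magnitude.
    if not 1 <= level <= max_level:
        raise ValueError("level must be in [1, max_level]")
    return max_magnitude * level / max_level

def med_augment_sketch(img, level=5, max_ops=3, rng=random):
    # Sample a limited number of operations from both spaces and apply them.
    if max_ops < 1:
        raise ValueError("max_ops must be at least 1")
    n_pixel = rng.randint(0, min(1, max_ops - 1))  # at most one pixel op
    n_spatial = rng.randint(1, min(max_ops - n_pixel, len(SPATIAL_SPACE)))
    ops = rng.sample(PIXEL_SPACE, n_pixel) + rng.sample(SPATIAL_SPACE, n_spatial)
    rng.shuffle(ops)
    for op, max_m in ops:
        img = op(img, magnitude(level, max_m), rng)
    return img

# Usage: augment a synthetic grayscale image with values in [0, 1].
image = np.linspace(0.0, 1.0, 64).reshape(8, 8)
augmented = med_augment_sketch(image, level=5, rng=random.Random(0))
```

In this sketch, raising or lowering `level` is the only knob a user touches; every per-operation magnitude follows from it, which mirrors the single-hyperparameter control described in the abstract without claiming to reproduce the actual mapping.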
