The integration of neural-network-based systems into clinical practice is limited by challenges related to domain generalization and robustness. The computer vision community established benchmarks such as ImageNet-C as a fundamental prerequisite for measuring progress on these challenges. Similar datasets are largely absent in the medical imaging community, which lacks a comprehensive benchmark spanning imaging modalities and applications. To address this gap, we create and open-source MedMNIST-C, a benchmark dataset based on the MedMNIST+ collection covering 12 datasets and 9 imaging modalities. We simulate task- and modality-specific image corruptions of varying severity to comprehensively evaluate the robustness of established algorithms against real-world artifacts and distribution shifts. We further provide quantitative evidence that our simple-to-use artificial corruptions allow for highly performant, lightweight data augmentation to enhance model robustness. Unlike traditional, generic augmentation strategies, our approach leverages domain knowledge and yields significantly higher robustness than widely adopted methods. By introducing MedMNIST-C and open-sourcing the corresponding library, which allows for targeted data augmentations, we contribute to the development of increasingly robust methods tailored to the challenges of medical imaging. The code is available at https://github.com/francescodisalvo05/medmnistc-api.
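The idea of a corruption with graded severity that doubles as a data augmentation can be illustrated with a minimal sketch. This is not the MedMNIST-C library's API: the function names `gaussian_noise` and `augment`, the five-level severity scale, and the noise values in `NOISE_STD` are all assumptions chosen for illustration, and additive Gaussian noise merely stands in for the task- and modality-specific corruptions described above. The sketch assumes NumPy is installed.

```python
import numpy as np

# Hypothetical noise standard deviations for severity 1..5
# (illustrative values, not taken from MedMNIST-C).
NOISE_STD = (0.04, 0.08, 0.12, 0.18, 0.26)


def gaussian_noise(image, severity, rng):
    """Corrupt a uint8 image with additive Gaussian noise at the given severity (1-5)."""
    if not 1 <= severity <= len(NOISE_STD):
        raise ValueError("severity must be between 1 and 5")
    scaled = image.astype(np.float32) / 255.0  # map pixel values to [0, 1]
    noisy = scaled + rng.normal(0.0, NOISE_STD[severity - 1], size=scaled.shape)
    # Clip back to the valid range and restore the uint8 pixel format.
    return (np.clip(noisy, 0.0, 1.0) * 255.0).round().astype(np.uint8)


def augment(image, rng, p=0.5):
    """With probability p, apply the corruption at a random severity; otherwise return the image unchanged."""
    if rng.random() < p:
        severity = int(rng.integers(1, len(NOISE_STD) + 1))
        return gaussian_noise(image, severity, rng)
    return image


rng = np.random.default_rng(0)
image = np.full((28, 28), 128, dtype=np.uint8)  # flat gray stand-in for a scan
corrupted = gaussian_noise(image, severity=3, rng=rng)
```

For benchmarking, a fixed severity would be applied to every test image; for training, `augment` would be called per sample so the model sees a mix of clean and corrupted inputs.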