Image augmentations are essential for effective visual representation learning across self-supervised learning techniques. While augmentation strategies for natural images have been studied extensively, medical images differ substantially from their natural counterparts. It is therefore unknown whether, and to what extent, the augmentation strategies commonly used in Siamese representation learning generalize to medical images. To address this gap, we systematically assess the effect of various augmentations on the quality and robustness of the learned representations. We train and evaluate Siamese networks for abnormality detection on chest X-rays across three large datasets (MIMIC-CXR, CheXpert and VinDR-CXR). We investigate the efficacy of the learned representations through experiments involving linear probing, fine-tuning, zero-shot transfer, and data efficiency. Finally, we identify a set of augmentations that yield robust representations that generalize well to both out-of-distribution data and diseases, while outperforming supervised baselines by up to 20% using only zero-shot transfer and linear probes. Our code is available at https://github.com/StanfordMIMI/siaug.
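As a rough illustration of the Siamese setup discussed above, the sketch below builds two augmented views of one grayscale image and scores a pair of embeddings with negative cosine similarity, the objective used in SimSiam-style training. Everything here is an assumption made for illustration: the abstract does not name the objective or the augmentations finally selected, and the helper names (`random_resized_crop`, `adjust_contrast`, `two_views`, `negative_cosine_similarity`) and parameter ranges are hypothetical, not taken from the siaug repository.

```python
import numpy as np


def random_resized_crop(img, rng, scale=(0.3, 1.0)):
    """Crop a random region covering `scale` of the area, then resize back
    to the input size with nearest-neighbour sampling (illustrative only)."""
    h, w = img.shape
    frac = rng.uniform(*scale)                      # fraction of the area to keep
    ch = max(1, int(round(h * np.sqrt(frac))))      # crop height
    cw = max(1, int(round(w * np.sqrt(frac))))      # crop width
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h                   # nearest-neighbour row indices
    cols = np.arange(w) * cw // w                   # nearest-neighbour column indices
    return crop[np.ix_(rows, cols)]


def adjust_contrast(img, rng, strength=0.2):
    """Scale intensities around the image mean by a random factor."""
    factor = rng.uniform(1 - strength, 1 + strength)
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)


def two_views(img, rng):
    """Return two independently augmented views of the same image."""
    def augment(x):
        return adjust_contrast(random_resized_crop(x, rng), rng)
    return augment(img), augment(img)


def negative_cosine_similarity(p, z):
    """Negative cosine similarity between two 1-D embedding vectors."""
    p = p / np.linalg.norm(p)
    z = z / np.linalg.norm(z)
    return -float(p @ z)


# Usage: a random array stands in for a chest X-ray with values in [0, 1].
rng = np.random.default_rng(0)
image = rng.random((64, 64))
view_a, view_b = two_views(image, rng)
```

In an actual pipeline the two views would be passed through a shared encoder, and the loss would be computed on the resulting embeddings rather than on raw pixels; the encoder is omitted here to keep the sketch self-contained.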