Several recent self-supervised learning methods are trained by mapping different augmentations of the same image to the same feature representation. The choice of data augmentations is crucial to the quality of the learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in the learned representations. To address this problem, we propose a more realistic, physics-based color data augmentation, which we call Planckian Jitter, that creates realistic variations in chromaticity and produces a model that is robust to the illumination changes commonly observed in real life while retaining the ability to discriminate image content based on color information. Experiments confirm that this representation is complementary to the representations learned with the commonly used color jitter augmentation, and that simply concatenating the two leads to significant performance gains on a wide range of downstream datasets. In addition, we present a color sensitivity analysis that documents the impact of different training methods on model neurons and shows that the performance of the learned features is robust to illuminant variations.
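To make the idea of a physics-based color augmentation concrete, the following is a minimal sketch of re-illuminating an image under a randomly sampled black-body illuminant. It is not the authors' implementation. It assumes a narrow-band sensor approximation, in which each RGB channel responds at a single wavelength, so per-channel gains can be read directly from Planck's law. The channel wavelengths, the 6500 K reference temperature, the sampled temperature range, and all function names are illustrative assumptions.

```python
import numpy as np

# Physical constants (SI units).
PLANCK_H = 6.62607015e-34   # Planck constant, J*s
LIGHT_C = 2.99792458e8      # speed of light, m/s
BOLTZMANN_K = 1.380649e-23  # Boltzmann constant, J/K

# Assumed representative wavelengths for the R, G, B channels (hypothetical values).
WAVELENGTHS_M = np.array([610e-9, 550e-9, 450e-9])
# Assumed reference illuminant temperature at which the image is left unchanged.
T_REF = 6500.0


def planck_radiance(wavelength_m, temperature_k):
    """Spectral radiance of a black body from Planck's law."""
    scale = 2.0 * PLANCK_H * LIGHT_C**2 / wavelength_m**5
    exponent = PLANCK_H * LIGHT_C / (wavelength_m * BOLTZMANN_K * temperature_k)
    return scale / np.expm1(exponent)


def planckian_gains(temperature_k, reference_k=T_REF):
    """Per-channel RGB gains for moving from the reference illuminant to
    a black body at temperature_k, normalized so the green gain is 1."""
    ratio = (planck_radiance(WAVELENGTHS_M, temperature_k)
             / planck_radiance(WAVELENGTHS_M, reference_k))
    return ratio / ratio[1]


def planckian_jitter(image, rng, t_min=3000.0, t_max=15000.0):
    """Re-illuminate a float RGB image of shape (H, W, 3) with values in [0, 1].

    The temperature range is an assumption made for this sketch.
    """
    temperature = rng.uniform(t_min, t_max)
    return np.clip(image * planckian_gains(temperature), 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.full((4, 4, 3), 0.5)
    print(planckian_gains(3000.0))   # warm illuminant: red gain > 1 > blue gain
    print(planckian_jitter(image, rng).shape)
```

Because the gains depend on a single physical parameter, the color temperature, the augmented images stay on a plausible illuminant curve. Jittering hue and saturation independently does not have this property. A faithful implementation would integrate the illuminant spectrum against measured camera sensitivities instead of sampling three wavelengths.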