What distinguishes robust models from non-robust ones? For ImageNet distribution shifts, it has been shown that differences in robustness can be traced back predominantly to differences in training data; however, it remains unknown how this translates into differences in what the models have actually learned. In this work, we bridge this gap by probing the representation spaces of 16 robust zero-shot CLIP vision encoders with various backbones (ResNets and ViTs) and pretraining sets (OpenAI, LAION-400M, LAION-2B, YFCC15M, CC12M and DataComp), and comparing them to the representation spaces of less robust models with identical backbones but different (pre)training sets or objectives (CLIP pretraining on ImageNet-Captions, and supervised training or finetuning on ImageNet). Through this analysis, we generate three novel insights. First, we detect the presence of outlier features in robust zero-shot CLIP vision encoders, which to the best of our knowledge is the first time these have been observed in non-language and non-transformer models. Second, we find the existence of outlier features to be indicative of ImageNet shift robustness, since in our analysis they appear only in robust models. Finally, we investigate the number of unique encoded concepts in the representation space and find that zero-shot CLIP models encode a higher number of unique concepts. However, we do not find this to be an indicator of ImageNet shift robustness and hypothesize that it is instead related to the language supervision. Since the presence of outlier features can be detected without access to any data from shifted datasets, we believe that they could be a useful tool for practitioners to gauge the distribution shift robustness of a pretrained model during deployment.
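As a rough illustration of how such a data-free check might look in practice, the sketch below flags candidate outlier feature dimensions in a set of image embeddings by comparing per-dimension mean activation magnitudes against the median across dimensions. This heuristic, the function name `find_outlier_dims`, and the threshold `k` are illustrative assumptions, not the paper's exact detection criterion.

```python
# Minimal sketch (assumed heuristic, not the paper's method): flag feature
# dimensions whose mean absolute activation is far above the typical dimension.
import torch

def find_outlier_dims(feats: torch.Tensor, k: float = 5.0) -> torch.Tensor:
    """feats: (N, D) image embeddings from a frozen vision encoder."""
    mags = feats.abs().mean(dim=0)          # per-dimension mean |activation|
    median = mags.median()                  # typical dimension magnitude
    return torch.nonzero(mags > k * median).flatten()

if __name__ == "__main__":
    # Random features stand in for real CLIP embeddings in this toy example.
    torch.manual_seed(0)
    feats = torch.randn(1000, 512)
    feats[:, 42] *= 20.0                    # inject one artificial outlier dimension
    print(find_outlier_dims(feats))         # -> tensor([42])
```

In a real setting, `feats` would be the encoder's outputs on any convenient unlabeled image set (e.g. the in-distribution training or validation data), since the point of the abstract's claim is that no shifted data is required.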