Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers

Vision transformers (ViTs) can be trained using various learning paradigms, from fully supervised to self-supervised. Different training protocols often yield significantly different feature spaces, which are usually compared through alignment analysis. However, current alignment measures summarize this relationship as a single scalar value, which obscures the distinction between common and unique features in pairs of representations that share the same scalar alignment. We address this limitation by combining alignment analysis with concept discovery, which breaks alignment down into the individual concepts encoded in feature space. This fine-grained comparison reveals both universal and unique concepts across different representations, as well as the internal structure of concepts within each of them. Our methodological contributions address two key prerequisites for concept-based alignment: 1) To describe a representation in terms of concepts that faithfully capture the geometry of the feature space, we define concepts as the most general structure they can form, namely arbitrary manifolds, so that hidden features can be described by their proximity to these manifolds. 2) To measure distances between the concept proximity scores of two representations, we use a generalized Rand index and partition it to obtain the alignment between pairs of concepts. In a sanity check, we confirm that our novel concept definition is better suited to alignment analysis than existing linear baselines. The concept-based alignment analysis of representations from four different ViTs reveals that increased supervision correlates with a reduction in the semantic structure of learned representations.
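
To make the idea of partitioning a Rand index concrete, the following is a minimal sketch of the classical Rand index for hard cluster assignments, together with a contingency-table breakdown of the co-clustered sample pairs. It is an illustration under stated assumptions, not the paper's method: the generalized Rand index over concept proximity scores is not specified in the abstract, and the function names `rand_index` and `co_clustered_pairs` and the toy labels are hypothetical.

```python
from itertools import combinations
from math import comb

import numpy as np


def rand_index(labels_a, labels_b):
    """Classical Rand index: fraction of sample pairs on which two hard
    clusterings agree (same cluster in both, or different clusters in both)."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    n = len(labels_a)
    agree = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += int(same_a == same_b)
    return agree / comb(n, 2)


def co_clustered_pairs(labels_a, labels_b):
    """Breakdown by cluster pair: entry [p, q] counts the sample pairs that
    share cluster p in the first clustering and cluster q in the second."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    ids_a = np.unique(labels_a)
    ids_b = np.unique(labels_b)
    table = np.zeros((len(ids_a), len(ids_b)), dtype=int)
    for p, a in enumerate(ids_a):
        for q, b in enumerate(ids_b):
            n_pq = int(np.sum((labels_a == a) & (labels_b == b)))
            table[p, q] = comb(n_pq, 2)
    return table


if __name__ == "__main__":
    # Hypothetical toy assignments of six samples to clusters.
    reference = [0, 0, 0, 1, 1, 1]
    candidate_1 = [0, 0, 1, 1, 2, 2]
    candidate_2 = [0, 1, 1, 2, 2, 0]
    for candidate in (candidate_1, candidate_2):
        print(rand_index(reference, candidate))
        print(co_clustered_pairs(reference, candidate))
```

In this toy example both candidates obtain the same scalar Rand index against the reference, yet the per-cluster-pair tables differ, which is the kind of distinction a single scalar hides.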
