As Vision Transformers (ViTs) become standard vision backbones, a mechanistic account of their computational phenomenology is essential. Despite architectural cues that hint at dynamical structure, there is no settled framework that interprets Transformer depth as a well-characterized flow. In this work, we introduce the Block-Recurrent Hypothesis (BRH), arguing that trained ViTs admit a block-recurrent depth structure such that the computation of the original $L$ blocks can be accurately rewritten using only $k \ll L$ distinct blocks applied recurrently. Across diverse ViTs, between-layer representational similarity matrices suggest a small number of contiguous phases. To determine whether these phases reflect genuinely reusable computation, we train block-recurrent surrogates of pretrained ViTs: Recurrent Approximations to Phase-structured TransfORmers (Raptor). In small-scale experiments, we demonstrate that stochastic depth and training promote recurrent structure, which in turn correlates with our ability to accurately fit Raptor. We then provide an empirical existence proof for the BRH by training a Raptor model that recovers $96\%$ of DINOv2 ImageNet-1k linear-probe accuracy with only 2 blocks at equivalent runtime. Finally, we leverage our hypothesis to develop a program of Dynamical Interpretability. We find i) directional convergence into class-dependent angular basins with self-correcting trajectories under small perturbations, ii) token-specific dynamics, where the cls token executes sharp late reorientations while patch tokens exhibit strong late-stage coherence toward their mean direction, and iii) a collapse to low-rank updates in late depth, consistent with convergence to low-dimensional attractors. Altogether, we find that a compact recurrent program emerges along ViT depth, pointing to a low-complexity normative solution that enables these models to be studied through principled dynamical systems analysis.
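As a minimal sketch of the block-recurrent rewrite the BRH posits (assumed names and shapes; this is not the authors' Raptor implementation), a depth-$L$ stack is replaced by $k$ shared blocks, each applied recurrently within its contiguous phase:

```python
import torch
import torch.nn as nn


class BlockRecurrent(nn.Module):
    """Hypothetical sketch: k distinct blocks, the i-th recurred n_i times,
    so that sum(repeats) == L stands in for an L-block stack."""

    def __init__(self, blocks: nn.ModuleList, repeats: list[int]):
        super().__init__()
        assert len(blocks) == len(repeats)
        self.blocks = blocks    # the k << L distinct transformer blocks
        self.repeats = repeats  # recurrence count per phase

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Depth unrolls as B_k^{n_k} \circ ... \circ B_1^{n_1}(x):
        # one shared block per phase, applied repeatedly within it.
        for block, n in zip(self.blocks, self.repeats):
            for _ in range(n):
                x = block(x)
        return x


# Example: two phases (k = 2) emulating a 12-block ViT (L = 12).
dim = 384
blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=dim, nhead=6, batch_first=True)
    for _ in range(2)
)
model = BlockRecurrent(blocks, repeats=[6, 6])
tokens = torch.randn(1, 197, dim)  # cls + 196 patch tokens
out = model(tokens)                # same shape as the input
```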