Deep Learning architectures, and in particular Transformers, are conventionally viewed as a composition of layers. These layers are often obtained as the sum of two contributions: a residual path that copies the input, and the output of a Transformer block. As a consequence, the inner representations (i.e., the inputs of these blocks) can be interpreted as an iterative refinement of a propagated latent representation. Under this lens, many works suggest that the inner space is shared across layers, meaning that tokens can be decoded at early stages. Mechanistic interpretability goes even further by conjecturing that some layers act as refinement layers. Following this path, we propose inference-time inner looping, which prolongs refinement in pretrained off-the-shelf language models by repeatedly re-applying a selected range of blocks. Across multiple benchmarks, inner looping yields modest but consistent accuracy improvements. Analyses of the resulting latent trajectories suggest more stable state evolution and continued semantic refinement. Overall, our results indicate that additional refinement can be obtained through simple test-time looping, extending computation in frozen pretrained models.
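The mechanism described above can be illustrated with a minimal sketch. It assumes each block's output is added to a residual stream, and models blocks as plain functions on a scalar state; the function names (`forward`, `forward_with_inner_looping`) and the loop-range parameters are hypothetical illustrations, not the paper's actual implementation, which operates on real Transformer blocks (attention, MLP, normalization) in a frozen pretrained model.

```python
def forward(h, blocks):
    """Standard residual forward pass: each block adds its output to the stream."""
    for block in blocks:
        h = h + block(h)
    return h


def forward_with_inner_looping(h, blocks, loop_start, loop_end, num_loops):
    """Forward pass that re-applies blocks[loop_start:loop_end] num_loops extra
    times at inference, prolonging refinement of the latent state h."""
    for i, block in enumerate(blocks):
        h = h + block(h)  # residual update, as in the standard pass
        if i == loop_end - 1:  # after the selected range, loop it again
            for _ in range(num_loops):
                for b in blocks[loop_start:loop_end]:
                    h = h + b(h)
    return h


# Toy usage: a single "block" that returns its input unchanged.
identity_block = lambda x: x
print(forward(1.0, [identity_block]))                              # 1 + 1 = 2.0
print(forward_with_inner_looping(1.0, [identity_block], 0, 1, 1))  # one extra loop: 2 + 2 = 4.0
```

The key point of the sketch is that inner looping changes only the inference-time control flow: no weights are modified, and the looped blocks read from and write to the same shared residual stream.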