Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer layers than the number of reasoning steps. This raises the question: what solutions are learned by these shallow and non-recurrent models? We find that a low-depth Transformer can represent the computations of any finite-state automaton (thus, any bounded-memory algorithm), by hierarchically reparameterizing its recurrent dynamics. Our theoretical results characterize shortcut solutions, whereby a Transformer with $o(T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$. We find that polynomial-sized $O(\log T)$-depth solutions always exist; furthermore, $O(1)$-depth simulators are surprisingly common, and can be understood using tools from Krohn-Rhodes theory and circuit complexity. Empirically, we perform synthetic experiments by training Transformers to simulate a wide variety of automata, and show that shortcut solutions can be learned via standard training. We further investigate the brittleness of these solutions and propose potential mitigations.
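One way to see why a depth of $O(\log T)$ can suffice is to note that each input symbol induces a state-to-state map, and that composing these maps is associative, so all prefix compositions can be computed by repeated doubling. The following is a minimal sketch of that idea only, not the Transformer construction itself; the names `sequential_states`, `shortcut_states`, `compose`, and the parity automaton `PARITY` are illustrative assumptions rather than anything defined in the source.

```python
def sequential_states(delta, q0, inputs):
    """Run the automaton one symbol at a time (T sequential steps)."""
    states, q = [], q0
    for sym in inputs:
        q = delta[q][sym]
        states.append(q)
    return states


def compose(f, g):
    """Return the map 'apply f, then g'; maps are tuples indexed by state."""
    return tuple(g[f[q]] for q in range(len(f)))


def shortcut_states(delta, q0, inputs):
    """Compute the same states using rounds of prefix doubling.

    Returns (states, rounds). Each round plays the role of one 'layer':
    every position is updated independently from the previous round's
    values, so a round could run fully in parallel.
    """
    n_states = len(delta)
    # maps[t] starts as the state-to-state map induced by inputs[t] alone.
    maps = [tuple(delta[q][sym] for q in range(n_states)) for sym in inputs]
    rounds, offset = 0, 1
    while offset < len(maps):
        # After this round, maps[t] covers the last 2 * offset symbols up to t
        # (or the whole prefix, if t is smaller than that).
        maps = [
            compose(maps[t - offset], maps[t]) if t >= offset else maps[t]
            for t in range(len(maps))
        ]
        offset *= 2
        rounds += 1
    # Applying each prefix map to the start state yields the state sequence.
    return [m[q0] for m in maps], rounds


# Hypothetical example: a two-state parity automaton, delta[state][symbol].
PARITY = [[0, 1], [1, 0]]
bits = [1, 0, 1, 1, 0, 1, 1, 1]
states, rounds = shortcut_states(PARITY, 0, bits)
assert states == sequential_states(PARITY, 0, bits)
assert rounds == 3  # 8 symbols handled in log2(8) doubling rounds
```

The sketch stores each map as an explicit table over states, which is only practical for small automata; it is meant to illustrate the logarithmic-depth shortcut, and says nothing about the $O(1)$-depth cases, which depend on the algebraic structure of the automaton.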