In recent years, state-tracking tasks, particularly permutation composition, have become a testbed for understanding the limits of sequence-model architectures such as Transformers and RNNs (linear and non-linear). However, these are often sequence-to-sequence tasks: learning to map actions (permutations) to states, which is incompatible with the next-token prediction setting commonly used to train language models. We address this gap by converting permutation composition into code via REPL traces that interleave state reveals (prints) with variable transformations. We show that linear RNNs capable of state tracking also excel in this setting, while Transformers still fail. Motivated by this representation, we investigate why tracking states in code is difficult in general: actions are not always fully observable. We frame this as tracking the state of a probabilistic finite-state automaton with deterministic state reveals and show that, in this setup, linear RNNs can be worse than non-linear RNNs at state tracking.
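To make the REPL-trace representation concrete, here is a minimal sketch of how a permutation-composition problem might be rendered as code with interleaved state reveals. All names (`make_repl_trace`, the use of transpositions, the reveal probability) are illustrative assumptions, not the paper's actual data-generation pipeline.

```python
import random

def make_repl_trace(n_items=3, n_steps=6, reveal_prob=0.3, seed=0):
    """Sketch: render a random permutation-composition problem as a
    REPL-style trace. Each step applies a permutation (a transposition
    here, for simplicity) to a state held in a variable, and with some
    probability emits a print that reveals the current state."""
    rng = random.Random(seed)
    state = list(range(n_items))
    lines = [f"x = {state!r}"]
    for _ in range(n_steps):
        i, j = rng.sample(range(n_items), 2)  # a random transposition
        state[i], state[j] = state[j], state[i]
        lines.append(f"x[{i}], x[{j}] = x[{j}], x[{i}]")
        if rng.random() < reveal_prob:  # interleaved state reveal
            lines.append(f"print(x)  # -> {state!r}")
    return "\n".join(lines), state

trace, final_state = make_repl_trace()
print(trace)
```

A model trained with next-token prediction on such traces must track the latent state to predict the revealed values after each `print(x)`, which is what makes this framing compatible with standard language-model training.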