Deriving formal bounds on the expressivity of transformers and studying transformers constructed to implement known algorithms are both effective methods for better understanding their computational power. Toward both ends, we introduce the temporal counting logic $\textsf{K}_\text{t}$[#] alongside the RASP variant $\textsf{C-RASP}$. We show that the two are equivalent and that, together, they constitute the best-known lower bound on the formal expressivity of future-masked soft-attention transformers with unbounded input length. We prove this by showing that every $\textsf{K}_\text{t}$[#] formula can be compiled into such a transformer.