As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help clarify the power of transformers relative to other models of computation, their fundamental capabilities and limits, and the impact of architectural choices. Work in this subarea has made considerable progress in recent years. Here, we undertake a comprehensive survey of this work, documenting the diverse assumptions that underlie different results and providing a unified framework for harmonizing seemingly contradictory findings.