In this article we prove that the general transformer neural model undergirding modern large language models (LLMs) is Turing complete under reasonable assumptions. This is the first work to directly address the Turing completeness of the underlying technology employed in GPT-x, as past work has focused on the more expressive, full auto-encoder transformer architecture. From this theoretical analysis, we show that the sparsity/compressibility of the word embedding is an important consideration for Turing completeness to hold. We also show that transformers are a variant of the B machines studied by Hao Wang.
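For context on the last claim: Wang's B machine is a non-erasing machine operating on a tape of cells that are either blank or marked, with only four instruction types: shift left, shift right, mark the scanned cell, and a conditional jump taken when the scanned cell is marked. The sketch below is a minimal illustrative interpreter for such a machine; the tuple-based instruction encoding (e.g. "MARK", "JUMP_IF_MARKED") is our own choice for exposition and is not part of Wang's formulation or of the construction in this article.

```python
from collections import defaultdict

def run_b_machine(program, max_steps=10_000):
    """Run a Wang B-machine program on an initially blank, unbounded tape.

    The machine is non-erasing: a cell can only change from blank (0) to
    marked (1). Execution halts when control falls past the last instruction
    or the step budget is exhausted.
    """
    tape = defaultdict(int)   # unbounded tape, every cell blank (0) by default
    head = 0                  # current head position
    pc = 0                    # program counter (0-indexed instruction address)
    steps = 0

    while 0 <= pc < len(program) and steps < max_steps:
        op = program[pc]
        if op[0] == "R":                    # shift head one cell to the right
            head += 1
        elif op[0] == "L":                  # shift head one cell to the left
            head -= 1
        elif op[0] == "MARK":               # mark the scanned cell (never erased)
            tape[head] = 1
        elif op[0] == "JUMP_IF_MARKED":     # conditional transfer on a marked cell
            if tape[head] == 1:
                pc = op[1]
                steps += 1
                continue
        pc += 1
        steps += 1

    marked_cells = {cell for cell, v in tape.items() if v == 1}
    return marked_cells, head

# Example program: mark three consecutive cells to the right of the start.
marked, head = run_b_machine([("MARK",), ("R",), ("MARK",), ("R",), ("MARK",)])
print(sorted(marked), head)  # -> [0, 1, 2] 2
```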