Language models now provide an interface to express and often solve general problems in natural language, yet their ultimate computational capabilities remain a major topic of scientific debate. Unlike a formal computer, a language model is trained to autoregressively predict successive elements in human-generated text. We prove that chaining a language model's autoregressive output is sufficient to perform universal computation. That is, a language model can simulate the execution of any algorithm on any input. The challenge of eliciting desired computational behaviour can thus be reframed in terms of programmability: the ease of finding a suitable prompt. Strikingly, we demonstrate that even randomly initialized language models are capable of universal computation before training. This implies that training does not give rise to computational expressiveness -- rather, it improves programmability, enabling a natural language interface for accessing these intrinsic capabilities.
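To make the notion of "chaining" concrete, the sketch below shows the generic loop it refers to: a model's own predictions are appended to its context and fed back as the next input, so computation proceeds entirely through repeated autoregressive calls. This is a minimal illustration under assumptions of our own; the function name next_token, the halt_token convention, and the step bound are hypothetical and not part of the paper's construction.

```python
from typing import Callable, List

def chain(next_token: Callable[[List[str]], str],
          prompt: List[str],
          halt_token: str = "<halt>",
          max_steps: int = 10_000) -> List[str]:
    """Repeatedly feed a model's autoregressive output back into its own context.

    next_token is any single-step predictor (e.g., greedy decoding from a
    language model); this sketch only shows the chaining loop, not the
    specific prompt that programs the model to simulate a given algorithm.
    """
    context = list(prompt)
    for _ in range(max_steps):
        token = next_token(context)   # one autoregressive prediction
        context.append(token)         # the output becomes part of the next input
        if token == halt_token:       # stop if the simulated program signals halting
            break
    return context

# Toy usage with a hand-written predictor standing in for a language model:
# it counts down from the last numeric token and then emits the halt token.
def toy_predictor(context: List[str]) -> str:
    last = int(context[-1])
    return str(last - 1) if last > 0 else "<halt>"

print(chain(toy_predictor, ["3"]))   # ['3', '2', '1', '0', '<halt>']
```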