Since the success of GPT, large language models (LLMs) have been revolutionizing machine learning and have given rise to the so-called LLM prompting paradigm: rather than training a separate model for each task, one trains a single general-purpose LLM and steers it with different prompts to perform different tasks. However, this empirical success largely lacks theoretical understanding. In this work, we present, to the best of our knowledge, the first theoretical study of the LLM prompting paradigm. We show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that, for any computable function, there is a corresponding prompt under which the Transformer computes that function. Furthermore, we show that although we use only a single finite-size Transformer, it still achieves nearly the same complexity bounds as those of the class of all unbounded-size Transformers. Overall, our results reveal that prompting can make a single finite-size Transformer efficiently universal, establishing a theoretical underpinning for prompt engineering in practice. A formal sketch of the central claim is given below.
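To make the central claim concrete, one possible formalization is sketched below. The symbols here are illustrative assumptions rather than the paper's exact definitions: Γ denotes the fixed finite-size Transformer, Σ the token alphabet, p_f a prompt encoding the function f, and Γ(p_f ∥ x) the string Γ generates autoregressively from the prompt concatenated with the input.

% Illustrative formal statement (a sketch; the alphabet, concatenation,
% and generation conventions are assumptions, not the paper's definitions).
\[
  \exists\, \Gamma \ \text{(a single finite-size Transformer)}
  \;\text{s.t.}\;
  \forall f : \Sigma^* \to \Sigma^* \ \text{computable},\;
  \exists\, p_f \in \Sigma^*
  \;\text{s.t.}\;
  \forall x \in \Sigma^*:\quad \Gamma(p_f \,\Vert\, x) = f(x).
\]

Under this reading, the fixed model Γ plays the role of a universal machine and the prompt p_f plays the role of a program, which is the sense in which prompting makes a single finite-size Transformer universal.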