Artificial intelligence is making spectacular progress, and one of the best examples is the development of large language models (LLMs) such as OpenAI's GPT series. In these lectures, written for readers with a background in mathematics or physics, we give a brief history and survey of the state of the art, and describe the underlying transformer architecture in detail. We then explore some current ideas on how LLMs work and how models trained to predict the next word in a text are able to perform other tasks displaying intelligence.