Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when a model is trained on data generated by humans, we may not expect it to outperform those humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities that surpass the abilities of the experts generating its data. We demonstrate transcendence by training an autoregressive transformer to play chess from game transcripts, and show that the trained model can sometimes achieve better performance than all players in the dataset. We theoretically prove that transcendence is enabled by low-temperature sampling, and rigorously assess this experimentally. Finally, we discuss other sources of transcendence, laying the groundwork for future investigation of this phenomenon in a broader setting.
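To make the role of low-temperature sampling concrete, here is a minimal toy sketch in Python. It is not the paper's proof or experimental setup: the two "experts", their move distributions, and the function name `apply_temperature` are all hypothetical, chosen only for illustration. Each expert plays the good move half the time but blunders in a different way, so a model that imitates their average puts the most mass on the good move, and lowering the temperature concentrates sampling on it.

```python
import math

def apply_temperature(probs, tau):
    """Rescale a distribution by temperature tau: p_i^(1/tau), renormalized."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Work in log space; zero-probability entries stay at zero.
    logits = [math.log(p) / tau if p > 0 else float("-inf") for p in probs]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical move set: index 0 is the good move, 1 and 2 are blunders.
expert_a = [0.5, 0.5, 0.0]  # assumed: blunders only with move 1
expert_b = [0.5, 0.0, 0.5]  # assumed: blunders only with move 2

# A perfect imitator of an even mix of both experts learns their average.
mixture = [(a + b) / 2 for a, b in zip(expert_a, expert_b)]

for tau in (1.0, 0.5, 0.1):
    sharpened = apply_temperature(mixture, tau)
    print(f"tau={tau}: P(good move) = {sharpened[0]:.4f}")
```

At `tau = 1.0` the imitator is no better than either expert, since it picks the good move with the same probability they do. As `tau` shrinks, the probability of the good move rises above what either expert achieves, because the experts' errors do not overlap and are averaged down in the mixture.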