Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities that surpass the abilities of the experts generating its data. We demonstrate transcendence by training an autoregressive transformer to play chess from game transcripts, and show that the trained model can sometimes achieve better performance than all players in the dataset. We theoretically prove that transcendence can be enabled by low-temperature sampling, and rigorously assess this claim experimentally. Finally, we discuss other sources of transcendence, laying the groundwork for future investigation of this phenomenon in a broader setting.
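To make the role of low-temperature sampling concrete, the following is a minimal toy sketch, not the paper's experimental setup. It assumes two hypothetical experts who each play the good move 60% of the time but make their mistakes on different moves; all probabilities and the function name `apply_temperature` are illustrative assumptions. A model that imitates their mixture and then samples at a lower temperature concentrates probability on the move the experts agree on.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a distribution so that p_i becomes p_i^(1/T) / sum_j p_j^(1/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space for stability; zero-probability entries stay at zero.
    logs = [math.log(p) / temperature if p > 0 else float("-inf") for p in probs]
    peak = max(logs)
    weights = [math.exp(value - peak) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical move distributions over [good_move, blunder_a, blunder_b].
expert_1 = [0.6, 0.4, 0.0]
expert_2 = [0.6, 0.0, 0.4]

# A model that perfectly imitates the data learns the average of the experts.
mixture = [(a + b) / 2 for a, b in zip(expert_1, expert_2)]

# At temperature 1 the model is no better than either expert;
# at a lower temperature it plays the good move more often than both.
imitation = apply_temperature(mixture, 1.0)
sharpened = apply_temperature(mixture, 0.25)

print("P(good move), each expert:      ", expert_1[0])
print("P(good move), model at T = 1.0: ", round(imitation[0], 4))
print("P(good move), model at T = 0.25:", round(sharpened[0], 4))
```

In this toy setting the gain comes entirely from the experts erring in different places, so their errors average out in the mixture while the shared good move does not. If both experts made the same mistake, lowering the temperature would not help.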