An agent trained within a closed system can master any desired capability, as long as the following three conditions hold: (a) it receives sufficiently informative and aligned feedback, (b) its coverage of experience/data is broad enough, and (c) it has sufficient capacity and resources. In this position paper, we justify these conditions and consider what limitations arise from (a) and (b) in closed systems, assuming that (c) is not a bottleneck. Considering the special case of agents whose input and output spaces match (namely, language), we argue that such pure recursive self-improvement, dubbed "Socratic learning", can boost performance vastly beyond what is present in the initial data or knowledge, limited only by time and by gradual misalignment concerns. Furthermore, we propose a constructive framework to implement it, based on the notion of language games.