Many recent machine learning research papers have ``open-ended learning'' in their title, but very few of them attempt to define what they mean by the term. Worse, on closer inspection there seems to be no consensus on what distinguishes open-ended learning from related concepts such as continual learning, lifelong learning, or autotelic learning. In this paper, we contribute to fixing this situation. After tracing the genealogy of the concept and surveying more recent perspectives on what it truly means, we show that open-ended learning is generally conceived as a composite notion encompassing a set of diverse properties. In contrast with previous approaches, we propose to isolate a key elementary property of open-ended processes: producing, from time to time and over an infinite horizon, elements (e.g., observations, options, reward functions, or goals) that are considered novel from an observer's perspective. From there, we build the notion of open-ended learning problems and focus in particular on the subset of open-ended goal-conditioned reinforcement learning problems, in which agents can learn a growing repertoire of goal-driven skills. Finally, we highlight the work that remains to be done to bridge the gap between our elementary definition and the more involved notions of open-ended learning that developmental AI researchers may have in mind.