This report on world models distinguishes prior work by the cognitive functions each contribution advances. Many works claim almost human-like cognitive capabilities for their world models. Evaluating these claims requires grounding in first principles from theories of human and machine cognition. As a step towards human-like world models, we present a unified conceptual framework that incorporates all the cognitive functions (i.e., memory, perception, language, reasoning, imagining, motivation, and metacognition) and identify gaps in existing research to guide future work. In particular, we find that motivation (especially intrinsic motivation) and metacognition remain drastically under-researched, and we propose concrete directions, informed by active inference and global workspace theory, to address these gaps. We also introduce epistemic world models, a new category encompassing agent frameworks for scientific discovery that operate over structured knowledge. Our taxonomy, applied to video, embodied, and epistemic world models, suggests research directions that prior taxonomies have not.