General world models represent a crucial pathway toward Artificial General Intelligence (AGI), serving as a cornerstone for applications ranging from virtual environments to decision-making systems. Recently, the Sora model has attracted significant attention for its remarkable simulation capabilities, which exhibit an incipient comprehension of physical laws. In this survey, we comprehensively explore the latest advancements in world models. We first review generative methodologies at the forefront of video generation, where world models facilitate the synthesis of highly realistic visual content. We then examine the burgeoning field of autonomous-driving world models and delineate their role in reshaping transportation and urban mobility. Next, we consider world models deployed within autonomous agents, highlighting their significance in enabling intelligent interaction with dynamic environments. Finally, we examine the challenges and limitations of world models and discuss potential future directions. We hope this survey can serve as a foundational reference for the research community and inspire continued innovation. This survey will be regularly updated at: https://github.com/GigaAI-research/General-World-Models-Survey.