Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

Zheng Zhu, Xiaofeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang

Source: arXiv. This survey will be regularly updated at: https://github.com/GigaAI-research/General-World-Models-Survey

General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for applications ranging from virtual environments to decision-making systems. Recently, the Sora model has attracted significant attention for its remarkable simulation capabilities, which exhibit an incipient comprehension of physical laws. In this survey, we explore the latest advancements in world models. Our analysis first covers the forefront of generative methodologies in video generation, where world models serve as pivotal constructs for synthesizing highly realistic visual content. We then examine the emerging field of autonomous-driving world models, delineating their role in reshaping transportation and urban mobility. Next, we turn to world models deployed within autonomous agents, highlighting their significance in enabling intelligent interactions within dynamic environments. Finally, we examine the challenges and limitations of world models and discuss potential future directions. We hope this survey can serve as a foundational reference for the research community and inspire continued innovation.

