In this paper, we propose NeoVerse, a versatile 4D world model capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a scalability limitation common to current 4D world modeling methods, which stems either from their reliance on expensive, specialized multi-view 4D data or from cumbersome training pre-processing. In contrast, NeoVerse is built on a core design philosophy: making the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs make NeoVerse versatile and allow it to generalize across various domains. Meanwhile, NeoVerse achieves state-of-the-art performance on standard reconstruction and generation benchmarks. Our project page is available at https://neoverse-4d.github.io