Reproducible closed-loop evaluation remains a major bottleneck in embodied AI tasks such as visual navigation. A promising path forward is high-fidelity simulation that combines photorealistic sensor rendering with geometrically grounded interaction in complex, open-world urban environments. Although recent video-3DGS methods make open-world scene capture easier, they remain unsuitable for benchmarking because of large visual and geometric sim-to-real gaps. To address these challenges, we introduce Wanderland, a real-to-sim framework that features multi-sensor capture, reliable reconstruction, accurate geometry, and robust view synthesis. Using this pipeline, we curate a diverse dataset of indoor-outdoor urban scenes and systematically demonstrate that image-only pipelines scale poorly, that geometry quality affects novel view synthesis, and that both factors adversely affect navigation policy learning and evaluation reliability. Beyond serving as a trusted testbed for embodied navigation, Wanderland's rich raw sensor data also allows benchmarking of 3D reconstruction and novel view synthesis models. Our work establishes a new foundation for reproducible research in open-world embodied AI. The project website is at https://ai4ce.github.io/wanderland/.