We present LRM-Zero, a Large Reconstruction Model (LRM) trained entirely on synthesized 3D data that achieves high-quality sparse-view 3D reconstruction. The core of LRM-Zero is our procedural 3D dataset, Zeroverse, which is automatically synthesized from simple primitive shapes with random texturing and augmentations (e.g., height fields, boolean differences, and wireframes). Unlike previous 3D datasets (e.g., Objaverse), which are often captured or crafted by humans to approximate real 3D data, Zeroverse completely ignores realistic global semantics but is rich in complex geometric and texture details that are locally similar to, or even more intricate than, those of real objects. We demonstrate that LRM-Zero, trained on the fully synthesized Zeroverse, achieves high visual quality when reconstructing real-world objects, competitive with models trained on Objaverse. We also analyze several critical design choices of Zeroverse that contribute to LRM-Zero's capability and training stability. Our work demonstrates that 3D reconstruction, one of the core tasks in 3D vision, can potentially be addressed without the semantics of real-world objects. Zeroverse's procedural synthesis code and an interactive visualization are available at: https://desaixie.github.io/lrm-zero/.
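To make the procedural idea concrete, the following Python sketch samples a random composition of primitives, each with a random pose, a random texture index, and an optional set of the augmentations named above. It is only an illustration under stated assumptions and not the released Zeroverse code: the primitive set, the parameter ranges, the texture pool size, the augmentation probability, and the names `Part` and `sample_object` are all hypothetical, and the sketch produces a scene description rather than actual meshes.

```python
import random
from dataclasses import dataclass, field

# Hypothetical names: the abstract lists the augmentation types but does not
# specify the primitive set or any sampling parameters.
PRIMITIVES = ("cube", "sphere", "cylinder")
AUGMENTATIONS = ("height_field", "boolean_difference", "wireframe")


@dataclass
class Part:
    primitive: str
    scale: tuple          # per-axis scale factors
    rotation_deg: tuple   # Euler angles in degrees
    translation: tuple    # offset relative to the object origin
    texture_id: int       # index into an assumed texture pool
    augmentations: list = field(default_factory=list)


def sample_object(rng, max_parts=5, num_textures=100, aug_prob=0.3):
    """Sample one object as a list of randomly posed, textured primitives.

    All ranges and probabilities are assumptions made for illustration.
    """
    parts = []
    for _ in range(rng.randint(1, max_parts)):
        part = Part(
            primitive=rng.choice(PRIMITIVES),
            scale=tuple(rng.uniform(0.2, 1.0) for _ in range(3)),
            rotation_deg=tuple(rng.uniform(0.0, 360.0) for _ in range(3)),
            translation=tuple(rng.uniform(-0.5, 0.5) for _ in range(3)),
            texture_id=rng.randrange(num_textures),
        )
        # Each augmentation is selected independently with probability aug_prob.
        part.augmentations = [a for a in AUGMENTATIONS if rng.random() < aug_prob]
        parts.append(part)
    return parts


# Seeding the generator makes a sampled object reproducible.
obj = sample_object(random.Random(0))
```

Because nothing in the sampler refers to object categories, every object it describes is semantically meaningless by construction; any realism would have to come from local geometry and texture, which is the property the abstract attributes to Zeroverse.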