Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases. To facilitate the development of 3D perception, reconstruction, and generation in the real world, we propose OmniObject3D, a large-vocabulary 3D object dataset with massive high-quality real-scanned 3D objects. OmniObject3D has several appealing properties: 1) Large Vocabulary: It comprises 6,000 scanned objects in 190 daily categories, sharing common classes with popular 2D datasets (e.g., ImageNet and LVIS), benefiting the pursuit of generalizable 3D representations. 2) Rich Annotations: Each 3D object is captured with both 2D and 3D sensors, providing textured meshes, point clouds, multi-view rendered images, and multiple real-captured videos. 3) Realistic Scans: The professional scanners support high-quality object scans with precise shapes and realistic appearances. With the vast exploration space offered by OmniObject3D, we carefully set up four evaluation tracks: a) robust 3D perception, b) novel-view synthesis, c) neural surface reconstruction, and d) 3D object generation. Extensive studies are performed on these four benchmarks, revealing new observations, challenges, and opportunities for future research in realistic 3D vision.