Metaverse technologies demand accurate, real-time, and immersive modeling on consumer-grade hardware, both for non-human perception (e.g., drone, robot, and autonomous-car navigation) and for immersive technologies such as AR/VR, which requires both structural accuracy and photorealism. However, there is a knowledge gap in how to apply geometric reconstruction and photorealistic modeling (novel view synthesis) in a unified framework. To address this gap and promote the development of robust and immersive modeling and rendering with consumer-grade devices, we propose a real-world Multi-Sensor Hybrid Room Dataset (MuSHRoom). Our dataset presents exciting challenges: it requires state-of-the-art methods to be cost-effective, to be robust to noisy data and devices, and to jointly learn 3D reconstruction and novel view synthesis instead of treating them as separate tasks, making them well suited to real-world applications. We benchmark several well-known pipelines on our dataset for joint 3D mesh reconstruction and novel view synthesis. Our dataset and benchmark show great potential for advancing the fusion of 3D reconstruction and high-quality rendering in a robust and computationally efficient end-to-end fashion. The dataset and code are available at the project website: https://xuqianren.github.io/publications/MuSHRoom/.