Recent advances in 3D object reconstruction have been remarkable, yet most current 3D models rely heavily on existing 3D datasets. The scarcity of diverse 3D datasets limits the generalization capabilities of 3D reconstruction models. In this paper, we propose MVBoost, a novel framework that boosts 3D reconstruction with multi-view refinement by generating pseudo-ground-truth (pseudo-GT) data. The key idea of MVBoost is to combine the high accuracy of the multi-view generation model with the consistency of the 3D reconstruction model to create a reliable data source. Specifically, given a single-view input image, we employ a multi-view diffusion model to generate multiple views, followed by a large 3D reconstruction model to produce consistent 3D data. MVBoost then adaptively refines the multi-view images rendered from this consistent 3D data to build a large-scale multi-view dataset for training a feed-forward 3D reconstruction model. Additionally, we design an input view optimization step that optimizes the corresponding viewpoint based on the user's input image, ensuring that the most important viewpoint is accurately tailored to the user's needs. Extensive evaluations demonstrate that our method achieves superior reconstruction results and robust generalization compared to prior work.
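The data-generation pipeline described in the abstract can be sketched as a chain of four stages: generate views, reconstruct 3D, render, refine. The Python below is only a minimal illustration of that data flow, not the authors' implementation. Every name in it (`PseudoGTSample`, `build_pseudo_gt_sample`, and the `toy_*` stand-ins) is hypothetical, images are reduced to flat lists of floats, and the refinement step is assumed to take both the rendered view and the generated view, since the abstract does not specify how the adaptive refinement works.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Placeholder image type (assumption): a flat list of pixel intensities.
Image = List[float]


@dataclass
class PseudoGTSample:
    """One training pair: the single input view and its refined target views."""
    input_view: Image
    target_views: List[Image]


def build_pseudo_gt_sample(
    input_image: Image,
    generate_views: Callable[[Image], List[Image]],       # stands in for the multi-view diffusion model
    reconstruct_3d: Callable[[Sequence[Image]], object],  # stands in for the large 3D reconstruction model
    render_views: Callable[[object, int], List[Image]],   # renders n views from the 3D data
    refine_view: Callable[[Image, Image], Image],         # hypothetical refinement: (rendered, generated) -> refined
) -> PseudoGTSample:
    """Run the four stages in order and package the result as pseudo-GT."""
    generated = generate_views(input_image)
    shape = reconstruct_3d(generated)
    rendered = render_views(shape, len(generated))
    refined = [refine_view(r, g) for r, g in zip(rendered, generated)]
    return PseudoGTSample(input_view=input_image, target_views=refined)


# Toy stand-ins so the sketch runs end to end; they are NOT real models.
def toy_generate_views(image: Image) -> List[Image]:
    # Four "views": the input shifted by 0, 1, 2, 3.
    return [[v + k for v in image] for k in range(4)]


def toy_reconstruct_3d(views: Sequence[Image]) -> Image:
    # "3D data" here is just the per-pixel mean of all views.
    n = len(views)
    return [sum(pixels) / n for pixels in zip(*views)]


def toy_render_views(shape: Image, n: int) -> List[Image]:
    # Every rendered view is a copy of the mean: consistent but less detailed.
    return [list(shape) for _ in range(n)]


def toy_refine_view(rendered: Image, generated: Image) -> Image:
    # Equal-weight blend of the consistent render and the detailed generation.
    return [0.5 * (r + g) for r, g in zip(rendered, generated)]


sample = build_pseudo_gt_sample(
    [0.0, 1.0],
    toy_generate_views,
    toy_reconstruct_3d,
    toy_render_views,
    toy_refine_view,
)
```

Passing the stages in as callables keeps the sketch independent of any particular diffusion or reconstruction model. Collecting one `PseudoGTSample` per input image would give the kind of multi-view dataset the abstract describes for training a feed-forward reconstruction model.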