GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting

Source: arXiv; published in ACM Transactions on Graphics (SIGGRAPH Asia 2024). Project page: https://gaussianobject.github.io/ Code: https://github.com/chensjtu/GaussianObject

Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views contain only very limited 3D information, leading to two significant challenges: 1) difficulty in building multi-view consistency, as there are too few images for matching; 2) partially omitted or highly compressed object information, as view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images. We first introduce techniques of visual hull and floater elimination, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. We further design a COLMAP-free variant, where pre-given accurate camera poses are not required, which achieves competitive quality and facilitates wider applications. GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and our own collected unposed images, achieving superior performance from only four views and significantly outperforming previous SOTA methods. Our demo is available at https://gaussianobject.github.io/, and the code has been released at https://github.com/GaussianObject/GaussianObject.
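
The visual hull idea mentioned in the abstract can be illustrated with classic silhouette carving: sample candidate 3D points and keep only those that project inside the object mask in every input view. The block below is a minimal sketch of that general technique, not the authors' implementation. The function name `visual_hull`, its parameters, the dense voxel-grid sampling, and the assumption that each view comes with a binary mask and a 3x4 projection matrix are all illustrative choices made here.

```python
import numpy as np


def visual_hull(masks, projections, bounds=(-1.0, 1.0), resolution=32):
    """Carve a voxel grid with object silhouettes (hypothetical helper).

    masks:       list of (H, W) boolean arrays, True where the object is.
    projections: list of (3, 4) camera matrices P = K [R | t].
    bounds:      (low, high) extent of the cubic sampling volume.
    resolution:  number of samples per axis.
    Returns an (N, 3) array of points that fall inside every silhouette.
    """
    lo, hi = bounds
    axis = np.linspace(lo, hi, resolution)
    # All voxel centers as an (N, 3) array.
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, 3)
    # Homogeneous coordinates, shape (N, 4).
    homog = np.concatenate([grid, np.ones((len(grid), 1))], axis=1)

    keep = np.ones(len(grid), dtype=bool)
    for mask, P in zip(masks, projections):
        h, w = mask.shape
        uvw = homog @ P.T  # projected homogeneous pixels, shape (N, 3)
        depth = uvw[:, 2]
        in_front = depth > 1e-8
        # Avoid dividing by zero for points behind or on the camera plane.
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.round(uvw[:, 0] / safe_depth).astype(int)  # column index
        v = np.round(uvw[:, 1] / safe_depth).astype(int)  # row index
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(grid), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        # A point survives only if every view sees it inside the silhouette.
        keep &= hit
    return grid[keep]
```

With only four views the carved volume is a loose outer bound on the object, which is why it serves as an initialization for the Gaussians rather than as a final shape. Points that survive carving could then seed the coarse 3D Gaussian representation that the abstract describes.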
