The field of 3D reconstruction from images has evolved rapidly in the past few years, first with the introduction of Neural Radiance Fields (NeRF) and more recently with 3D Gaussian Splatting (3DGS). The latter offers a significant edge over NeRF in training and inference speed, as well as in reconstruction quality. Although 3DGS works well for dense input images, its unstructured, point-cloud-like representation quickly overfits in the more challenging setting of extremely sparse input images (e.g., 3 images), producing a representation that appears as a jumble of needles from novel views. To address this issue, we propose regularized optimization and depth-based initialization. Our key idea is to introduce a structured Gaussian representation that can be controlled in 2D image space. We then constrain the Gaussians, in particular their positions, and prevent them from moving independently during optimization. Specifically, we introduce single-view and multiview constraints through an implicit convolutional decoder and a total variation loss, respectively. With the coherency introduced to the Gaussians, we further constrain the optimization through a flow-based loss function. To support our regularized optimization, we propose an approach to initialize the Gaussians using monocular depth estimates at each input view. We demonstrate significant improvements over state-of-the-art sparse-view NeRF-based approaches on a variety of scenes.
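To make the ideas of depth-based initialization and a total variation penalty more concrete, the following is a minimal NumPy sketch, not the method's actual implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, one Gaussian per pixel, and a depth map that has already been brought to a usable scale (monocular depth estimates are generally scale-ambiguous, which this sketch ignores). The function names `unproject_depth` and `total_variation` are illustrative and do not come from the paper.

```python
import numpy as np


def unproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H, W, 3) grid of camera-space points.

    Assumes a pinhole camera; each pixel yields one candidate Gaussian center,
    so the result is a position map that lives on the 2D image grid.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)


def total_variation(position_map):
    """Anisotropic total variation of an (H, W, C) map.

    Returns the mean absolute difference between horizontally adjacent pixels
    plus the mean absolute difference between vertically adjacent pixels.
    Small values mean neighboring Gaussians move together.
    """
    dx = np.abs(position_map[:, 1:, :] - position_map[:, :-1, :])
    dy = np.abs(position_map[1:, :, :] - position_map[:-1, :, :])
    return float(dx.mean() + dy.mean())


# Illustrative usage with a synthetic fronto-parallel plane at depth 2.
depth = np.full((4, 5), 2.0)
points = unproject_depth(depth, fx=2.0, fy=2.0, cx=2.0, cy=1.5)
tv = total_variation(points)
```

Because the positions are stored as an image-space grid, a penalty of this kind discourages neighboring Gaussians from drifting apart independently, which is the intuition behind constraining a structured representation rather than an unstructured point cloud.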