While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses. To relax this constraint, several efforts have been made to train Neural Radiance Fields (NeRFs) without pre-processed camera poses. However, the implicit representation used by NeRFs makes it harder to optimize the 3D structure and the camera poses at the same time. In contrast, the recently proposed 3D Gaussian Splatting offers new opportunities thanks to its explicit point-cloud representation. This paper leverages both this explicit geometric representation and the continuity of the input video stream to perform novel view synthesis without any Structure-from-Motion (SfM) preprocessing. We process the input frames sequentially and progressively grow the set of 3D Gaussians by taking one input frame at a time, without the need to pre-compute camera poses. Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes. Our project page is https://oasisyang.github.io/colmap-free-3dgs