Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications. However, current approaches suffer from drawbacks such as struggling with large scene deformations, inaccurate shape completion, or requiring 2D point tracks. In contrast, our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can operate on monocular RGB videos directly by using differentiable volume rendering. This technique includes three components that are new in the context of non-rigid 3D reconstruction: 1) a coordinate-based and implicit neural representation for non-rigid scenes, which in conjunction with differentiable volume rendering enables an unbiased reconstruction of dynamic scenes; 2) a proof that extends the unbiased formulation of volume rendering to dynamic scenes; and 3) a novel dynamic scene flow loss, which enables the reconstruction of larger deformations by leveraging the coarse estimates of other methods. Results on our new dataset, which will be made publicly available, demonstrate a clear improvement over the state of the art in terms of surface reconstruction accuracy and robustness to large deformations.
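To make the idea of unbiased volume rendering for a deforming scene more concrete, the following is a minimal sketch and not the method's actual implementation. It assumes a NeuS-style discrete opacity computed from a signed distance function (SDF), which is one well-known unbiased formulation; the abstract does not state which formulation Ub4D uses. The names `bend_to_canonical`, `canonical_sdf`, and `render_weights`, the toy translation used as a deformation, and the unit-sphere canonical shape are all hypothetical stand-ins for learned networks.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bend_to_canonical(points, t):
    # Hypothetical deformation model: a time-dependent translation along x.
    # In a learned system this would be a neural deformation field.
    offset = np.array([0.5 * t, 0.0, 0.0])
    return points - offset


def canonical_sdf(points, radius=1.0):
    # Hypothetical canonical shape: a sphere centred at the origin.
    return np.linalg.norm(points, axis=-1) - radius


def render_weights(origin, direction, t, n_samples=256,
                   near=0.0, far=4.0, sharpness=64.0):
    """Per-interval rendering weights along one ray at time t.

    Assumes the NeuS-style opacity
        alpha_i = max((Phi(f_i) - Phi(f_{i+1})) / Phi(f_i), 0),
    where Phi is a sigmoid scaled by `sharpness` and f_i is the SDF of the
    i-th sample after it has been bent into canonical space.
    """
    depths = np.linspace(near, far, n_samples + 1)
    points = origin + depths[:, None] * direction
    sdf = canonical_sdf(bend_to_canonical(points, t))
    cdf = sigmoid(sharpness * sdf)
    alpha = np.clip((cdf[:-1] - cdf[1:]) / (cdf[:-1] + 1e-8), 0.0, 1.0)
    # Transmittance: probability that the ray reaches interval i unoccluded.
    transmittance = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = transmittance * alpha
    mid_depths = 0.5 * (depths[:-1] + depths[1:])
    return mid_depths, weights
```

In this toy setup, a ray cast from `(0, 0, -3)` along `+z` should place its largest weight close to the first zero crossing of the bent SDF, and that location moves as `t` changes. This is the property that "unbiased" refers to in this sketch: the weight maximum coincides with the surface rather than lying in front of or behind it.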