Neural rendering has demonstrated remarkable success in dynamic scene reconstruction. Thanks to the expressiveness of neural representations, prior works can accurately capture the motion and achieve high-fidelity reconstruction of the target object. Despite this, real-world video scenarios often feature large unobserved regions where neural representations struggle to achieve realistic completion. To tackle this challenge, we introduce MorpheuS, a framework for dynamic 360° surface reconstruction from a casually captured RGB-D video. Our approach models the target scene as a canonical field that encodes its geometry and appearance, in conjunction with a deformation field that warps points from the current frame to the canonical space. We leverage a view-dependent diffusion prior and distill knowledge from it to achieve realistic completion of unobserved regions. Experimental results on various real-world and synthetic datasets show that our method can achieve high-fidelity 360° surface reconstruction of a deformable object from a monocular RGB-D video.
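To make the canonical-plus-deformation formulation concrete, the following is a minimal sketch and not the MorpheuS implementation. It assumes toy analytic functions in place of the learned networks: the names `deformation_field`, `canonical_field`, and `query` are hypothetical, the motion is a simple translation, and the canonical geometry is a unit sphere represented by a signed distance. The diffusion prior is not modeled here.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def deformation_field(x: Point, t: float) -> Point:
    """Hypothetical stand-in for a learned deformation network.

    Maps a point observed at time t to canonical space. The toy motion is
    a translation of 0.1 * t along the x-axis, so the warp subtracts it.
    """
    return (x[0] - 0.1 * t, x[1], x[2])


def canonical_field(xc: Point) -> Tuple[float, Point]:
    """Hypothetical stand-in for a learned canonical field.

    Returns (signed distance, RGB color) for a canonical-space point.
    The geometry is a toy unit sphere; the color is a constant grey.
    """
    sdf = math.sqrt(sum(c * c for c in xc)) - 1.0
    return sdf, (0.5, 0.5, 0.5)


def query(x: Point, t: float) -> Tuple[float, Point]:
    """Warp the observed point to canonical space, then query the canonical field."""
    return canonical_field(deformation_field(x, t))


if __name__ == "__main__":
    sdf, rgb = query((1.5, 0.0, 0.0), 5.0)
    print(sdf, rgb)
```

In this toy setup, the point (1.5, 0, 0) observed at t = 5 is warped to (1.0, 0, 0), which lies on the canonical sphere, so its signed distance is zero. In a learned system both functions would be neural networks optimized jointly against the RGB-D observations.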