Dynamic Neural Radiance Fields (NeRFs) achieve remarkable visual quality when synthesizing novel views of time-evolving 3D scenes. However, the common reliance on backward deformation fields makes reanimation of the captured object poses challenging. Moreover, state-of-the-art dynamic models are often limited by low visual fidelity, long reconstruction times, or specificity to narrow application domains. In this paper, we present a novel method that uses a point-based representation and Linear Blend Skinning (LBS) to jointly learn a Dynamic NeRF and an associated skeletal model, even from sparse multi-view video. Our forward-warping approach achieves state-of-the-art visual fidelity when synthesizing novel views and poses while significantly reducing the necessary learning time compared to existing work. We demonstrate the versatility of our representation on a variety of articulated objects from common datasets and obtain reposable 3D reconstructions without the need for object-specific skeletal templates. Code will be made available at https://github.com/lukasuz/Articulated-Point-NeRF.
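To make the forward-warping idea concrete, the following is a minimal sketch of the standard Linear Blend Skinning formula, in which each canonical point is moved by a weighted sum of per-bone rigid transforms. It is not the authors' implementation: the function names (`rotation_z`, `about_pivot`, `lbs_warp`), the two-bone toy skeleton, and the hand-set skinning weights are all assumptions made for illustration, whereas the method described above learns the skeleton and weights jointly with the radiance field.

```python
import math

def rotation_z(angle):
    """3x3 rotation matrix about the z-axis, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(rotation, translation, point):
    """Return rotation @ point + translation for a 3D point."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

def about_pivot(rotation, pivot):
    """Rigid transform (R, t) that rotates around a joint position."""
    rotated_pivot = apply_rigid(rotation, [0.0, 0.0, 0.0], pivot)
    translation = [pivot[i] - rotated_pivot[i] for i in range(3)]
    return rotation, translation

def lbs_warp(points, weights, bone_transforms):
    """Forward-warp canonical points: x' = sum_b w_b * (R_b x + t_b)."""
    warped = []
    for point, point_weights in zip(points, weights):
        blended = [0.0, 0.0, 0.0]
        for w, (rotation, translation) in zip(point_weights, bone_transforms):
            moved = apply_rigid(rotation, translation, point)
            for i in range(3):
                blended[i] += w * moved[i]
        warped.append(blended)
    return warped

# Hypothetical toy skeleton: bone 0 is fixed, bone 1 bends 90 degrees
# around a joint located at (1, 0, 0).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
bones = [
    (identity, [0.0, 0.0, 0.0]),
    about_pivot(rotation_z(math.pi / 2), [1.0, 0.0, 0.0]),
]

# Canonical points with hand-set skinning weights (each row sums to 1).
points = [[0.5, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

warped = lbs_warp(points, weights, bones)
for source, target in zip(points, warped):
    print(source, "->", [round(v, 6) for v in target])
```

Because the warp maps canonical points forward into the posed frame, changing the bone transforms directly yields a new pose, which is what makes a forward formulation convenient for reposing compared with a backward deformation field.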