Obtaining personalized, animatable 3D avatars from a monocular camera has many real-world applications in gaming, virtual try-on, animation, and VR/XR. However, modeling dynamic, fine-grained clothing deformations from such sparse data is very challenging. Existing methods for modeling 3D humans from depth data are limited in computational efficiency, mesh coherency, and flexibility in resolution and topology. For instance, reconstructing shapes with implicit functions and extracting an explicit mesh per frame is computationally expensive and cannot ensure coherent meshes across frames. Moreover, predicting per-vertex deformations on a pre-designed human template with a discrete surface lacks flexibility in resolution and topology. To overcome these limitations, we propose Neural Surface Fields (NSF), a novel method for modeling 3D clothed humans from monocular depth. NSF defines a neural field solely on the base surface, which models a continuous and flexible displacement field. At inference time, NSF can be adapted to base surfaces with different resolutions and topologies without retraining. Compared to existing approaches, our method eliminates the expensive per-frame surface extraction while maintaining mesh coherency, and it can reconstruct meshes at arbitrary resolution without retraining. To foster research in this direction, we release our code on the project page: https://yuxuan-xue.com/nsf.
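The idea of a displacement field defined on the base surface, queried at whatever resolution the mesh happens to have, can be illustrated with a minimal sketch. This is not the released NSF implementation. It assumes a tiny, randomly initialised MLP as a stand-in for a trained field, takes raw 3D point coordinates plus a hypothetical two-value pose code as input, and uses a tetrahedron as a toy base surface. All function and variable names (`make_field`, `displacement`, `deform`, `subdivide`, `pose`) are illustrative.

```python
import math
import random


def make_field(in_dim, hidden, out_dim, seed=0):
    """Random two-layer MLP weights; a stand-in for a trained field (assumption)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(out_dim)]
    return w1, w2


def displacement(field, point, pose):
    """Evaluate the field at one continuous surface point for a given pose code."""
    w1, w2 = field
    x = list(point) + list(pose)
    h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    return tuple(sum(w * v for w, v in zip(row, h)) for row in w2)


def deform(field, vertices, pose):
    """Displace every vertex by the field value at that vertex."""
    return [
        tuple(p + d for p, d in zip(v, displacement(field, v, pose)))
        for v in vertices
    ]


def subdivide(vertices, faces):
    """Midpoint subdivision: each triangle becomes four, new points lie on the surface."""
    verts = list(vertices)
    cache = {}

    def mid(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            cache[key] = len(verts)
            verts.append(tuple((a + b) / 2 for a, b in zip(verts[i], verts[j])))
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return verts, new_faces


# Toy base surface: a tetrahedron (4 vertices, 4 triangular faces).
base_verts = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]
base_faces = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]
pose = (0.3, -0.2)  # hypothetical pose code

field = make_field(in_dim=3 + len(pose), hidden=8, out_dim=3)
fine_verts, fine_faces = subdivide(base_verts, base_faces)

# The same field serves both resolutions; nothing is refit for the finer mesh.
coarse_deformed = deform(field, base_verts, pose)
fine_deformed = deform(field, fine_verts, pose)
```

Because the field is a function of position on the surface rather than a table indexed by vertex, the subdivided mesh is deformed by simply evaluating the same function at more points, and vertices shared by the coarse and fine meshes receive identical displacements. In this toy setup, that is the sense in which a continuous field avoids both per-frame surface extraction and a fixed template resolution.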