We present InstantGeoAvatar, a method that efficiently learns the detailed 3D geometry and appearance of animatable implicit human avatars from monocular video. Our key observation is that optimizing a hash grid encoding to represent a signed distance function (SDF) of the human subject is prone to instabilities and poor local minima. We therefore propose a principled, geometry-aware SDF regularization scheme that fits seamlessly into the volume rendering pipeline and adds negligible computational overhead. Our regularization scheme significantly outperforms previous approaches for training SDFs on hash grids. We obtain competitive results in geometry reconstruction and novel view synthesis in as little as five minutes of training time, down from the several hours required by previous work. InstantGeoAvatar marks a substantial step toward interactive reconstruction of virtual avatars.