Recent advances in Neural Radiance Fields (NeRFs) have made it possible to reconstruct and reanimate dynamic portrait scenes with control over head pose, facial expression, and viewing direction. However, training such models assumes photometric consistency over the deformed region: for example, the face must remain evenly lit as it deforms with changing head pose and facial expression. Such photometric consistency across the frames of a video is hard to maintain, even in studio environments, which makes the resulting reanimatable neural portraits prone to artifacts during reanimation. In this work, we propose CoDyNeRF, a system that enables the creation of fully controllable 3D portraits under real-world capture conditions. CoDyNeRF learns to approximate illumination-dependent effects via a dynamic appearance model in the canonical space that is conditioned on predicted surface normals and on the facial expression and head-pose deformations. Surface normal prediction is guided by normals from a 3D Morphable Model (3DMM), which act as a coarse prior for the normals of the human head, where direct prediction of normals is hard because of the rigid and non-rigid deformations induced by changes in head pose and facial expression. Using only a short smartphone-captured video of a subject for training, we demonstrate the effectiveness of our method on free-view synthesis of a portrait scene with explicit head-pose and expression controls and realistic lighting effects. The project page can be found here: http://shahrukhathar.github.io/2023/08/22/CoDyNeRF.html
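As a rough illustration of the two ideas named in the abstract (a coarse 3DMM normal prior, and an appearance model conditioned on normals, expression, and head pose), the following NumPy sketch may help. It is not the authors' implementation: the cosine-based penalty, the function names `normal_prior_loss` and `appearance_input`, and the vector sizes are all assumptions made for illustration, since the abstract does not specify the loss or the network inputs.

```python
import numpy as np


def normalize(v, eps=1e-8):
    """Scale vectors along the last axis to unit length."""
    return v / np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)


def normal_prior_loss(pred_normals, prior_normals, weights=None):
    """Mean (1 - cosine similarity) between predicted and prior normals.

    Assumed form of the guidance term: the source only states that 3DMM
    normals act as a coarse prior, not how the penalty is computed.
    """
    cos = np.sum(normalize(pred_normals) * normalize(prior_normals), axis=-1)
    per_point = 1.0 - cos
    if weights is None:
        return float(per_point.mean())
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * per_point) / np.maximum(weights.sum(), eps_sum))


eps_sum = 1e-8  # guards against division by zero when all weights are zero


def appearance_input(pred_normals, expression, head_pose):
    """Build a per-point conditioning vector for a hypothetical appearance model.

    Each row is [unit normal (3), expression code, head-pose code]; the
    per-frame codes are repeated for every sampled point.
    """
    n = normalize(pred_normals)
    cond = np.concatenate([expression, head_pose])
    cond = np.broadcast_to(cond, (n.shape[0], cond.shape[0]))
    return np.concatenate([n, cond], axis=-1)
```

Under these assumptions, identical predicted and prior normals give a loss near 0 and opposite normals give a loss near 2, so the term only penalizes angular disagreement and ignores vector magnitude. The optional `weights` argument is a hypothetical hook for down-weighting regions where the coarse 3DMM mesh is unreliable.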