Face rendering using neural radiance fields (NeRF) is a rapidly developing research area in computer vision. Recent methods focus primarily on controlling facial attributes such as identity and expression, but they often overlook eyeball rotation, which is important for various downstream tasks. In this paper, we aim to learn, from multi-view images, a face NeRF model that is sensitive to eye movements. We address two key challenges in eye-aware face NeRF learning: how to effectively capture eyeball rotation for training, and how to construct a manifold for representing eyeball rotation. To this end, we first fit FLAME, a well-established parametric face model, to the multi-view images while taking multi-view consistency into account. We then introduce a new Dynamic Eye-aware NeRF (DeNeRF). DeNeRF transforms 3D points from different views into a canonical space to learn a unified face NeRF model. For this transformation we design an eye deformation field that comprises a rigid component (e.g., eyeball rotation) and a non-rigid component. Through experiments on the ETH-XGaze dataset, we demonstrate that our model generates high-fidelity images with accurate eyeball rotation and non-rigid periocular deformation, even under novel viewing angles. Furthermore, we show that the rendered images can be used to improve gaze estimation performance.
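As a rough illustration of the rigid part of such a deformation field, the sketch below maps observed 3D points near an eyeball back to a canonical space by undoing a rotation about the eyeball center. It is not the DeNeRF implementation: the function names (`gaze_rotation`, `to_canonical`), the pitch/yaw parameterization, and the optional `offset_fn` standing in for a learned non-rigid offset are all assumptions made for this example.

```python
import numpy as np


def gaze_rotation(pitch, yaw):
    """Rotation matrix for a pitch about the x-axis followed by a yaw about the y-axis (radians).

    The pitch/yaw convention is an assumption for this sketch.
    """
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    return rot_y @ rot_x


def to_canonical(points, center, rotation, offset_fn=None):
    """Map observed points of shape (N, 3) to canonical space.

    Rigid part: undo the eyeball rotation about `center`, i.e. R^T (x - c) + c.
    Non-rigid part: `offset_fn` is a hypothetical callable returning per-point
    offsets of shape (N, 3); in a real model this would be a learned network.
    """
    # With points stored as rows, (R^T v)^T equals v^T R, hence the right-multiply by R.
    canonical = (points - center) @ rotation + center
    if offset_fn is not None:
        canonical = canonical + offset_fn(canonical)
    return canonical


if __name__ == "__main__":
    R = gaze_rotation(pitch=0.2, yaw=-0.3)
    center = np.array([0.03, 0.04, 0.0])
    canonical_pts = np.array([[0.03, 0.04, 0.012], [0.035, 0.04, 0.010]])
    # Forward-rotate the canonical points to simulate an observed gaze direction.
    observed_pts = (canonical_pts - center) @ R.T + center
    recovered = to_canonical(observed_pts, center, R)
    print(np.allclose(recovered, canonical_pts))
```

Keeping the rigid rotation separate from the non-rigid offset mirrors the split described in the abstract: the rotation has a closed-form inverse, while the periocular deformation would have to be learned.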