The human hand is the main medium through which we interact with our surroundings. Its digitization is therefore of utmost importance, with direct applications in VR/AR, gaming, and media production, among other areas. While several works model the geometry and articulation of hands, little attention has been paid to capturing photo-realistic appearance. Moreover, applications in extended reality and gaming require real-time rendering. In this work, we present the first neural-implicit approach to photo-realistically render hands in real time. This is a challenging problem, as hands are textured and undergo strong articulations with various pose-dependent effects. We show that it can nevertheless be achieved through a carefully designed method: we train on a low-resolution rendering of a neural radiance field, combined with a 3D-consistent super-resolution module and mesh-guided space canonicalization and sampling. In addition, we show that the novel application of a perceptual loss in image space is critical for achieving photorealism. We show rendering results for several identities and demonstrate that our method captures pose- and view-dependent appearance effects. We also show a live demo in which we photo-realistically render the human hand in real time, for the first time in the literature. We ablate all our design choices and show that our design optimizes for both photorealism and rendering speed. Our code will be released to encourage further research in this area.
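The abstract names mesh-guided sampling but gives no details, so the following is only a minimal sketch of one plausible reading: points along each camera ray are placed in a thin band around the depth at which the ray meets the hand mesh, rather than along the whole ray. The function name `sample_near_surface`, the parameters `half_width` and `n_samples`, and the assumption that the ray-mesh intersection depth is already known are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def sample_near_surface(
    origin: Vec3,
    direction: Vec3,
    surface_depth: float,
    half_width: float = 0.01,
    n_samples: int = 8,
) -> List[Vec3]:
    """Return n_samples points along a ray, restricted to a band around a surface.

    Assumption (not from the paper): surface_depth is the distance along the
    ray at which it hits the hand mesh, computed elsewhere (e.g. by a
    rasterizer or ray caster).
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")

    # Clamp the near bound so that no sample falls behind the camera.
    near = max(surface_depth - half_width, 0.0)
    far = surface_depth + half_width

    points: List[Vec3] = []
    for i in range(n_samples):
        # Use the midpoint of the i-th of n_samples equal-width bins.
        t = near + (far - near) * (i + 0.5) / n_samples
        x, y, z = (o + t * d for o, d in zip(origin, direction))
        points.append((x, y, z))
    return points
```

Concentrating samples near the mesh is one way a radiance field can be evaluated with few queries per ray, which is consistent with the stated real-time goal. The band width and sample count used here are illustrative values only.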