The human hand is the main medium through which we interact with our surroundings. Hence, its digitization is of utmost importance, with direct applications in VR/AR, gaming, and media production, among other areas. While several works model the geometry of hands, little attention has been paid to capturing their photo-realistic appearance. Moreover, for applications in extended reality and gaming, real-time rendering is critical. We present the first neural-implicit approach to photo-realistically render hands in real time. This is a challenging problem, as hands are textured and undergo strong articulations with pose-dependent effects. However, we show that this aim is achievable through our carefully designed method. It trains on low-resolution renderings of a neural radiance field and combines them with a 3D-consistent super-resolution module, mesh-guided sampling, and space canonicalization. We demonstrate a novel application of a perceptual loss in image space, which is critical for learning details accurately. We also show a live demo in which we photo-realistically render the human hand in real time for the first time, while also modeling pose- and view-dependent appearance effects. We ablate all our design choices and show that they optimize for rendering speed and quality. Our code will be released to encourage further research in this area. The supplementary video can be found at: tinyurl.com/46uvujzn