Hand avatars play a pivotal role in a wide array of digital interfaces, enhancing user immersion and facilitating natural interaction within virtual environments. While previous studies have focused on photo-realistic hand rendering, little attention has been paid to reconstructing hand geometry with fine details, which is essential to rendering quality. In the realms of extended reality and gaming, on-the-fly rendering is imperative. To this end, we introduce XHand, an expressive hand avatar designed to comprehensively generate hand shape, appearance, and deformations in real time. To obtain fine-grained hand meshes, we employ three feature embedding modules to predict hand deformation displacements, albedo, and linear blend skinning weights, respectively. To achieve photo-realistic hand rendering on fine-grained meshes, our method adopts a mesh-based neural renderer that leverages mesh topological consistency and latent codes from the embedding modules. During training, we propose a part-aware Laplacian smoothing strategy that applies distinct levels of regularization to each hand part, effectively preserving necessary details while eliminating undesired artifacts. Experimental evaluations on the InterHand2.6M and DeepHandMesh datasets demonstrate the efficacy of XHand, which recovers high-fidelity geometry and texture for hand animations across diverse poses in real time. To reproduce our results, we will make the full implementation publicly available at https://github.com/agnJason/XHand.
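To make the two core mesh operations concrete, the sketch below illustrates linear blend skinning (the deformation model whose per-vertex weights XHand predicts) and a part-aware Laplacian smoothing step with per-part regularization strength. This is a minimal, generic NumPy illustration of these standard techniques, not the paper's implementation; all function names, shapes, and the uniform-Laplacian choice are our assumptions.

```python
import numpy as np

def linear_blend_skinning(verts, weights, rotations, translations):
    """Deform rest-pose vertices with per-bone rigid transforms (standard LBS).

    verts: (V, 3) rest-pose vertices; weights: (V, J) skinning weights
    rotations: (J, 3, 3); translations: (J, 3)
    """
    # Transform every vertex by every bone: posed[j, v] = R_j @ v + t_j
    posed = np.einsum('jab,vb->jva', rotations, verts) + translations[:, None, :]
    # Blend the per-bone results with the skinning weights: (V, 3)
    return np.einsum('vj,jva->va', weights, posed)

def part_aware_laplacian_step(verts, neighbors, part_lambda, part_id):
    """One uniform-Laplacian smoothing step with a per-part strength.

    neighbors: list of V index arrays (one-ring per vertex)
    part_lambda: (P,) smoothing strength per hand part; part_id: (V,)
    """
    smoothed = verts.copy()
    for i, nbr in enumerate(neighbors):
        lam = part_lambda[part_id[i]]        # regularization level for this part
        centroid = verts[nbr].mean(axis=0)   # uniform Laplacian target
        # Pull the vertex toward its one-ring centroid by lam
        smoothed[i] = (1.0 - lam) * verts[i] + lam * centroid
    return smoothed
```

The part-aware weighting captures the abstract's idea of distinct regularization levels: regions that need detail (e.g. nails, knuckles) can use a small `lam`, while smoother regions (e.g. the palm) use a larger one.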