Neural radiance fields (NeRFs) are promising 3D representations for scenes, objects, and humans. However, most existing methods require multi-view inputs and per-scene training, which limits their real-life applications. Moreover, current methods focus on single-subject cases, leaving scenes of interacting hands, which involve severe inter-hand occlusions and challenging view variations, unsolved. To tackle these issues, this paper proposes a generalizable visibility-aware NeRF (VA-NeRF) framework for interacting hands. Specifically, given a single image of interacting hands as input, our VA-NeRF first obtains a mesh-based representation of the hands and extracts their corresponding geometric and textural features. Subsequently, a feature fusion module that exploits the visibility of query points and mesh vertices adaptively merges the features of both hands, enabling the recovery of features in unseen areas. Additionally, our VA-NeRF is optimized together with a novel discriminator within an adversarial learning paradigm. In contrast to conventional discriminators that predict a single real/fake label for the synthesized image, the proposed discriminator generates a pixel-wise visibility map, providing fine-grained supervision for unseen areas and encouraging the VA-NeRF to improve the visual quality of synthesized images. Experiments on the InterHand2.6M dataset demonstrate that our proposed VA-NeRF significantly outperforms conventional NeRFs. Project page: \url{https://github.com/XuanHuang0/VANeRF}.
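To make the idea of visibility-aware feature fusion concrete, the following is a minimal sketch, not the VA-NeRF module itself. It assumes that each query point gathers one feature vector per source (for example, one nearby mesh vertex from each hand), that each source carries a binary visibility flag with respect to the input image, and that sources are weighted by a hand-written softmax score. The function name `fuse_features` and the parameters `temperature` and `occlusion_penalty` are hypothetical and chosen for illustration only; the fusion described in the abstract is adaptive and is presumably learned rather than fixed in this way.

```python
import math


def fuse_features(features, visibility, distances,
                  temperature=1.0, occlusion_penalty=4.0):
    """Blend per-source feature vectors with visibility-aware softmax weights.

    Illustrative assumption only, not the paper's learned module.

    features:   list of equal-length feature vectors, one per source
    visibility: list of 0/1 flags (1 = source is visible in the input image)
    distances:  list of non-negative distances from the query point to each source
    """
    # Closer sources score higher; occluded sources are penalised.
    scores = [
        -(d / temperature) - (0.0 if v else occlusion_penalty)
        for d, v in zip(distances, visibility)
    ]

    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the feature vectors, one dimension at a time.
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights


# Two sources at equal distance: the first is visible, the second is occluded.
fused, weights = fuse_features(
    features=[[1.0, 0.0], [0.0, 1.0]],
    visibility=[1, 0],
    distances=[0.5, 0.5],
)
```

In this toy setting the visible source dominates the fused feature, while the occluded source still contributes a small share. This mirrors the intuition that features for unseen areas can be borrowed from better-observed regions instead of being discarded.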