Although Neural Radiance Fields (NeRF) have recently become popular in the computer vision community, registering multiple NeRFs has received little attention. Unlike the existing work NeRF2NeRF, which relies on traditional optimization and requires human-annotated keypoints, we propose DReg-NeRF to solve the NeRF registration problem on object-centric scenes without human intervention. After the NeRF models are trained, DReg-NeRF first extracts features from each NeRF's occupancy grid. It then uses a transformer architecture with self-attention and cross-attention layers to learn the relations between pairwise NeRF blocks. Unlike state-of-the-art (SOTA) point cloud registration methods, DReg-NeRF supervises the decoupled correspondences with surface fields and requires no ground-truth overlap labels. To train our network, we construct a novel view synthesis dataset with 1,700+ 3D objects obtained from Objaverse. When evaluated on the test set, our method outperforms SOTA point cloud registration methods by a large margin, with a mean $\text{RPE}=9.67^{\circ}$ and a mean $\text{RTE}=0.038$. Our code is available at https://github.com/AIBluefisher/DReg-NeRF.
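The abstract reports registration accuracy as a rotation error in degrees (RPE) and a translation error (RTE) but does not define either. The sketch below assumes the definitions commonly used in registration work: the geodesic angle between the estimated and ground-truth rotations, and the Euclidean distance between the estimated and ground-truth translations. The function names and the helper `rot_z` are hypothetical and are not taken from the DReg-NeRF code base.

```python
import math


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices.

    Assumed definition: angle = arccos((trace(R_gt^T R_est) - 1) / 2).
    """
    # trace(R_gt^T R_est) equals the sum of elementwise products.
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error(t_est, t_gt):
    """Euclidean distance between two 3D translation vectors (assumed definition)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_gt)))


def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # Toy example: the estimate is off by a small rotation and a small shift.
    print(rotation_error_deg(rot_z(30.0), rot_z(20.0)))
    print(translation_error([0.1, 0.0, 0.0], [0.1, 0.03, 0.04]))
```

Averaging these two quantities over all test pairs would give mean values comparable in kind to those quoted above, provided the paper uses the same definitions and scene scale.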