A practical benefit of implicit visual representations such as Neural Radiance Fields (NeRFs) is their memory efficiency: large scenes can be stored and shared as small neural networks instead of collections of images. However, operating on these implicit visual data structures requires extending classical image-based vision techniques (e.g., registration, blending) from image sets to neural fields. Towards this goal, we propose NeRFuser, a novel architecture for NeRF registration and blending that assumes access only to pre-generated NeRFs, and not to the potentially large sets of images used to generate them. We propose registration from re-rendering, a technique that infers the transformation between NeRFs from images synthesized from the individual NeRFs. For blending, we propose sample-based inverse distance weighting, which blends visual information at the ray-sample level. We evaluate NeRFuser on public benchmarks and a self-collected object-centric indoor dataset, showing the robustness of our method, including on views that are challenging to render from the individual source NeRFs.
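To make the idea of inverse distance weighting at the ray-sample level concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each source NeRF has a known center in a shared coordinate frame (i.e., registration has already been done), that distance is measured from a ray sample to each center, and that weights follow an inverse power law with a hypothetical exponent `gamma`. The function names `idw_weights` and `blend_sample`, the `eps` guard, and the choice to blend only per-sample colors are all illustrative assumptions.

```python
import math


def idw_weights(sample, centers, gamma=4.0, eps=1e-8):
    """Normalized inverse-distance weights for one ray sample.

    sample:  (x, y, z) position of the sample in the shared frame.
    centers: list of (x, y, z) centers, one per source NeRF (assumed known).
    gamma:   hypothetical exponent; larger values favor the nearest NeRF.
    eps:     small offset that avoids division by zero at a center.
    """
    raw = [(math.dist(sample, c) + eps) ** (-gamma) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def blend_sample(sample, centers, colors, gamma=4.0):
    """Blend the per-NeRF RGB predictions for one sample with IDW weights.

    colors: list of (r, g, b) tuples, the color each source NeRF predicts
            at this sample (assumed to be queried elsewhere).
    """
    weights = idw_weights(sample, centers, gamma)
    return tuple(
        sum(w * color[k] for w, color in zip(weights, colors))
        for k in range(3)
    )


if __name__ == "__main__":
    centers = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
    colors = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
    # A sample equidistant from both centers receives equal weights.
    print(blend_sample((0.0, 0.0, 0.0), centers, colors))
```

Under these assumptions, a sample close to one NeRF's center is dominated by that NeRF's prediction, while samples roughly equidistant from several NeRFs receive a smooth mixture. The blended per-sample values would then be composited along the ray by standard volume rendering, which is omitted here.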