This paper presents one of the first learning-based NeRF 3D instance segmentation pipelines, dubbed Instance Neural Radiance Field, or Instance NeRF. Taking a NeRF pretrained from multi-view RGB images as input, Instance NeRF learns the 3D instance segmentation of a given scene, represented as an instance field component of the NeRF model. To this end, we apply a 3D proposal-based mask prediction network to volumetric features sampled from the NeRF, which generates discrete 3D instance masks. The coarse 3D mask prediction is then projected to image space and matched against 2D segmentation masks generated from different views by existing panoptic segmentation models; the matched masks are used to supervise the training of the instance field. Notably, beyond generating consistent 2D segmentation maps from novel views, Instance NeRF can query instance information at any 3D point, which greatly enhances NeRF object segmentation and manipulation. Our method is also one of the first to achieve such results without ground-truth instance information during inference. In experiments on synthetic and real-world NeRF datasets with complex indoor scenes, Instance NeRF surpasses previous NeRF segmentation works and competitive 2D segmentation methods in segmentation performance on unseen views. See the demo video at https://youtu.be/wW9Bme73coI.
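
The matching step between projected 3D masks and per-view 2D masks can be illustrated with a minimal sketch. The abstract does not state the matching criterion, so the greedy IoU rule, the 0.5 threshold, and all function names below (`iou_matrix`, `greedy_match`, `relabel`) are assumptions made for illustration, not the method's actual implementation. Instance IDs are assumed to start at 1, with 0 reserved for background.

```python
import numpy as np

def iou_matrix(projected, predicted, num_proj, num_pred):
    """IoU between each projected 3D instance ID and each 2D mask ID.

    Both inputs are integer label maps of the same shape (0 = background).
    """
    iou = np.zeros((num_proj, num_pred))
    for i in range(num_proj):
        a = projected == i + 1
        for j in range(num_pred):
            b = predicted == j + 1
            union = np.logical_or(a, b).sum()
            if union > 0:
                iou[i, j] = np.logical_and(a, b).sum() / union
    return iou

def greedy_match(iou, threshold=0.5):
    """Pair IDs in descending IoU order (assumed rule; threshold must be > 0).

    Returns {2D mask ID: projected 3D instance ID}.
    """
    iou = iou.copy()
    mapping = {}
    while iou.size and iou.max() >= threshold:
        i, j = np.unravel_index(np.argmax(iou), iou.shape)
        mapping[int(j) + 1] = int(i) + 1
        iou[i, :] = -1.0  # each 3D instance is used at most once
        iou[:, j] = -1.0  # each 2D mask is used at most once
    return mapping

def relabel(predicted, mapping):
    """Rewrite a 2D label map so its IDs agree with the 3D instance IDs."""
    out = np.zeros_like(predicted)
    for pred_id, proj_id in mapping.items():
        out[predicted == pred_id] = proj_id
    return out

# Toy view: the 2D model found the same two objects but numbered them differently.
projected = np.array([[1, 1, 0, 0],
                      [1, 1, 0, 0],
                      [0, 0, 2, 2],
                      [0, 0, 2, 2]])
predicted = np.array([[2, 2, 0, 0],
                      [2, 2, 2, 0],
                      [0, 0, 1, 1],
                      [0, 0, 1, 1]])

iou = iou_matrix(projected, predicted, num_proj=2, num_pred=2)
mapping = greedy_match(iou)
relabeled = relabel(predicted, mapping)
print(mapping)
print(relabeled)
```

After relabeling, every view uses the instance IDs of the 3D prediction, which is what makes the 2D masks usable as multi-view-consistent supervision in this sketch. 2D masks with no sufficiently overlapping 3D instance are left as background here; a real pipeline might handle them differently.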