Neural Radiance Fields (NeRF) can generate highly realistic novel views. However, editing 3D scenes represented by NeRF across 360-degree views, particularly removing objects while preserving geometric and photometric consistency, remains challenging because NeRF represents the scene implicitly. In this paper, we propose InpaintNeRF360, a unified framework that uses natural language instructions as guidance for inpainting NeRF-based 3D scenes. Our approach generates multi-modal prompts from the encoded text and passes them to a promptable segmentation model to obtain multiview segmentations. We apply depth-space warping to enforce viewing consistency across the segmentations, and further refine the inpainted NeRF model using perceptual priors to ensure visual plausibility. InpaintNeRF360 can simultaneously remove multiple objects or modify object appearance based on text instructions, while synthesizing inpainted content that is consistent across 3D views and photo-realistic. Through extensive experiments on both unbounded and frontal-facing scenes reconstructed with NeRF, we demonstrate the effectiveness of our approach and showcase its potential to enhance the editability of implicit radiance fields.
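The abstract does not spell out how the depth-space warping is implemented. As an illustration only, the sketch below shows one common reading of the idea: a pixel in a source view is lifted to 3D using its rendered depth and reprojected into a target view, so that a prompt or mask location found in one view can be carried to another. The function name `warp_pixels`, the pinhole camera model, the shared intrinsics `K`, and the OpenCV-style axis convention (x right, y down, z forward) are all assumptions made for this sketch, not details taken from InpaintNeRF360.

```python
import numpy as np

def warp_pixels(uv, depth, K, src_c2w, tgt_c2w):
    """Reproject source-view pixels into a target view using per-pixel depth.

    Hypothetical helper for illustration; not the authors' implementation.
    uv:       (N, 2) pixel coordinates (x, y) in the source view
    depth:    (N,)   z-depth of each pixel along the source camera's forward axis
    K:        (3, 3) pinhole intrinsics, assumed shared by both views
    src_c2w:  (4, 4) source camera-to-world transform
    tgt_c2w:  (4, 4) target camera-to-world transform
    Returns (N, 2) pixel coordinates and (N,) depths in the target view.
    """
    uv = np.asarray(uv, dtype=float)
    depth = np.asarray(depth, dtype=float)
    ones = np.ones((uv.shape[0], 1))

    # Back-project pixels to 3D points in the source camera frame.
    pix_h = np.hstack([uv, ones])                             # (N, 3)
    cam_src = (np.linalg.inv(K) @ pix_h.T).T * depth[:, None]

    # Source camera frame -> world frame -> target camera frame.
    world = (src_c2w @ np.hstack([cam_src, ones]).T).T        # (N, 4)
    cam_tgt = (np.linalg.inv(tgt_c2w) @ world.T).T[:, :3]     # (N, 3)

    # Perspective projection into the target image plane.
    proj = (K @ cam_tgt.T).T
    return proj[:, :2] / proj[:, 2:3], cam_tgt[:, 2]
```

In a pipeline of this kind, the warped coordinates could be used to check whether segmentation prompts or masks from neighbouring views agree with each other. Points whose target depth is non-positive, or that land outside the image, would need to be discarded by the caller.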