Neural Radiance Fields (NeRFs) have recently emerged as a popular option for photo-realistic object capture due to their ability to faithfully reconstruct high-fidelity volumetric content even from handheld video input. Although much research has been devoted to efficient optimization, leading to real-time training and rendering, options for interactively editing NeRFs remain limited. We present a simple but effective neural network architecture that is fast and efficient while maintaining a low memory footprint. This architecture can be incrementally guided through user-friendly image-based edits. Our representation allows straightforward object selection via semantic feature distillation at the training stage. More importantly, we propose a local 3D-aware image context to facilitate view-consistent image editing, the results of which can then be distilled into fine-tuned NeRFs via geometric and appearance adjustments. We evaluate our setup on a variety of examples to demonstrate appearance and geometric edits, and report a 10-30x speedup over concurrent work focusing on text-guided NeRF editing. Video results can be seen on our project webpage at https://proteusnerf.github.io.