We address efficient, structure-aware 3D scene representation from images. Our key contribution is nerflets, a set of local neural radiance fields that together represent a scene. Each nerflet maintains its own spatial position, orientation, and extent, within which it contributes to panoptic, density, and radiance reconstructions. Using only photometric and inferred panoptic image supervision, we directly and jointly optimize the parameters of a set of nerflets to form a decomposed representation of the scene in which each object instance is represented by a group of nerflets. In experiments on indoor and outdoor environments, we find that nerflets (1) fit and approximate the scene more efficiently than traditional global NeRFs, (2) allow the extraction of panoptic and photometric renderings from arbitrary views, and (3) enable tasks rare for NeRFs, such as 3D panoptic segmentation and interactive editing.
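To make the idea of a local field with its own position, orientation, and extent concrete, here is a minimal sketch of how several such fields might be combined at a query point. It rests on assumptions that the text above does not state: the influence of each field is modeled as an anisotropic Gaussian, each field's density is a constant placeholder rather than a learned network, and the function names `influence` and `blend` are hypothetical.

```python
import numpy as np

def influence(points, center, rotation, radii):
    """Assumed anisotropic Gaussian falloff of one local field at points (N, 3).

    rotation: (3, 3) matrix whose columns are the field's local axes.
    radii:    (3,) extent along each local axis.
    """
    local = (points - center) @ rotation      # world -> local coordinates
    scaled = local / radii                    # normalize by per-axis extent
    return np.exp(-0.5 * np.sum(scaled ** 2, axis=-1))

def blend(points, centers, rotations, radii, densities):
    """Combine K local fields at N points.

    densities: (K,) constant density per field (a stand-in for a learned field).
    Returns the influence-weighted density (N,) and, per point, the index of
    the most influential field (N,), usable as a crude segmentation label.
    """
    weights = np.stack([
        influence(points, c, R, r)
        for c, R, r in zip(centers, rotations, radii)
    ])                                        # shape (K, N)
    sigma = (weights * densities[:, None]).sum(axis=0)
    owner = weights.argmax(axis=0)
    return sigma, owner

# Two axis-aligned fields of unit extent, four units apart along x.
centers = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
rotations = np.stack([np.eye(3), np.eye(3)])
radii = np.ones((2, 3))
densities = np.array([1.0, 2.0])
points = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])

sigma, owner = blend(points, centers, rotations, radii, densities)
```

In this sketch, mapping each field index in `owner` to an instance identifier would give a per-point instance label, which is one way to read the statement that each object instance is represented by a group of nerflets.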